Application No. 09/830,669 

Reply to Office Action of April 24, 2007 

SUPPORT FOR THE AMENDMENTS 

The specification has been amended to insert the address of the depository and to 

replace the Abstract. The amendments to the claims are supported by the specification. 

Accordingly, no new matter is believed to have been added to the present application by the 

amendments submitted above. 

REMARKS 

Claims 86-104 and 106-118 are pending. Favorable reconsideration is respectfully
requested.

The rejections of the claims under 35 U.S.C. §112, first paragraph, are respectfully 
traversed. 

The present specification provides a detailed description of the procedure for 
conducting the claimed method. In view of that description, one would appreciate that the 
invention as claimed is described and could be practiced with routine experimentation. 

In addition, Applicants submit herewith publications from the scientific literature
which demonstrate that valyl-tRNA synthetase genes had been extensively reported prior to
the filing date of the present application.

Jordana et al. (J. Biol. Chem. (1987) 262(15): 7189-94) reported, as early as 1987, the
sequence of the valyl-tRNA synthetase gene in Saccharomyces cerevisiae (yeast) and a
high level of homology between yeast and bacterial aminoacyl-tRNA genes.

Heck et al. (J. Biol. Chem. (1988) 263(2): 868-877) disclosed in 1988 the cloning and
sequencing of ValS of E. coli. These authors also found that ValS was closely related to
yeast valyl-tRNA genes.

Brown et al. (PNAS (1995) 92:2441-45) sequenced, in 1995, isoleucyl-tRNA genes in
a large array of microbial species and demonstrated that this family of genes probably 
expanded through the species by duplication. 

Luo et al. (J. Bact. (1997) 179(8): 2472-2478) sequenced ValS in B. subtilis and found
further similarities between valyl-tRNA synthetases from Bacillus subtilis, Bacillus
stearothermophilus, Lactobacillus casei and Escherichia coli.

The publications submitted herewith demonstrate that it was well-established in the 
art at the time the present application was filed that valyl-tRNA synthetase genes shared 
strong similarities throughout bacteria and yeast. 

In view of the foregoing, the present specification describes and enables the claimed
method. Accordingly, withdrawal of this ground of rejection is respectfully
requested.

Regarding the issue with respect to biological deposit of the subject matter of Claim 
106, Applicants confirm that the deposits were made under the terms of the Budapest Treaty. 
Copies of the deposit receipts are submitted herewith. The complete address for the 
depository has been added to the specification. 

Withdrawal of this ground of rejection is respectfully requested. 

The rejection of the claims under 35 U.S.C. §112, second paragraph, is believed to be 
obviated by the amendment submitted above. The claims have been amended as suggested 
by the Examiner in order to address the issues raised in the Office Action. In view of the 
foregoing, the claims are definite within the meaning of 35 U.S.C. §112, second paragraph.
Accordingly, withdrawal of this ground of rejection is respectfully requested. 

Applicants submit that the present application is in condition for allowance. Early 
notice to this effect is earnestly solicited. 



Respectfully submitted,



OBLON, SPIVAK, McCLELLAND, 
MAIER & NEUSTADT, P.C. 



Customer Number 



22850 




Tel: (703)413-3000 
Fax: (703)413-2220

(OSMMN 06/04) 
TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS



FORMULE INTERNATIONALE 



DESTINATAIRE 



INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS



RECEPISSE EN CAS DE DEPOT INITIAL, 
délivré en vertu de la règle 7.1 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée au bas de cette page






NOM ET ADRESSE 
DU DEPOSANT 



I. IDENTIFICATION DU MICRO-ORGANISME 


Référence d'identification donnée par le
DEPOSANT :

P5485 


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :

1 - 2340 



II. DESCRIPTION SCIENTIFIQUE ET/OU DESIGNATION TAXONOMIQUE PROPOSEE



Le micro-organisme identifié sous chiffre I était accompagné
d'une description scientifique
d'une désignation taxonomique proposée



(Cocher ce qui convient) 



III. RECEPTION ET ACCEPTATION



La présente autorité de dépôt internationale accepte le micro-organisme identifié sous
chiffre I, qu'elle a reçu le 26 OCTOBRE 1999 (date du dépôt initial) ¹



IV. RECEPTION D'UNE REQUETE EN CONVERSION 



La présente autorité de dépôt internationale a reçu le micro-organisme identifié sous
chiffre I le (date du dépôt initial)

et a reçu une requête en conversion du dépôt initial en dépôt conforme au Traité de
Budapest le (date de réception de la requête en conversion)



AUTORITE DE DEPOT INTERNATIONALE 



Nom



Adresse 



CNCM 

Collection Nationale

de Cultures de Microorganismes 

INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) : Simona OZDEN

Directeur de la CNCM



Date 



Paris, le 30 novembre 1999



1 En cas d'application de la règle 6.4.d), cette date est la date à laquelle le statut
d'autorité de dépôt internationale a été acquis.



Formule BP/4 (page unique)



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE 
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS 



FORMULE INTERNATIONALE



DESTINATAIRE 



Madame Danielle BERNEMAN, 
Bureau des Brevets et Inventions 
INSTITUT PASTEUR 
25-28, rue du Docteur Roux 
75724 PARIS CEDEX 15 



DECLARATION SUR LA VIABILITE, 
délivrée en vertu de la règle 10.2 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée à la page suivante



NOM ET ADRESSE DE LA PARTIE 
A LAQUELLE LA DECLARATION SUR LA
VIABILITE EST DELIVREE



I. DEPOSANT 


II. IDENTIFICATION DU MICRO-ORGANISME


Nom : INSTITUT PASTEUR


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :


Adresse : Bureau des Brevets et Inventions
25-28, rue du Docteur Roux 
75015 PARIS 


1-2340 

Date du dépôt ou du transfert ¹ :

26 OCTOBRE 1999 


III. DECLARATION SUR LA VIABILITE 


La viabilité du micro-organisme identifié sous chiffre II a été contrôlée
le 27 OCTOBRE 1999 ². A cette date, le micro-organisme

[X] ³ était viable

[ ] ³ n'était plus viable





1 Indiquer la date du dépôt initial ou, si un nouveau dépôt ou un transfert ont été
effectués, la plus récente des dates pertinentes (date du nouveau dépôt ou date du transfert).

2 Dans les cas visés à la règle 10.2.a)ii) et iii), mentionner le contrôle de viabilité
le plus récent.



3 Cocher la case qui convient.



Formule BP/9 (première page)
IV. CONDITIONS DANS LESQUELLES LE CONTROLE DE VIABILITE A ETE EFFECTUE



V. AUTORITE DE DEPOT INTERNATIONALE



Nom 



Adresse : 



CNCM 



Collection Nationale

de Cultures de Microorganismes 



INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 
FRANCE 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) :



Simona OZDEN 

Directeur de la CNCM




Georges WAGENER 

Conseiller Scientifique de la CNCM
pour les bactéries



Date : Paris, le 30 novembre 1999




4 A remplir si cette information a été demandée et si les résultats du contrôle étaient
négatifs.



Formule BP/9 (deuxième et dernière page)



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 



RECIPIENTS: 




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux
75015 PARIS


RECEIPT FOR INITIAL DEPOSIT,

issued in accordance with rule 7.1 by the 
INTERNATIONAL DEPOSIT AUTHORITY 
identified at the bottom of this page 


NAME AND ADDRESS OF 
DEPOSITOR 




I. IDENTIFICATION OF THE MICROORGANISM




Identification reference given by the 
DEPOSITOR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


p5485 


1-2340 


II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION 


The microorganism identified under heading I was accompanied:

[X] By a scientific description

[X] By a proposed taxonomic description




(Check the appropriate box) 




III. RECEIPT AND ACCEPTANCE 


The present International Deposit Authority accepts the microorganism identified under heading I, which it
received on October 26, 1999 (date of the initial deposit)


IV. RECEIPT OF A REQUEST FOR CONVERSION 


The present International Deposit Authority received the microorganism identified under heading I, which it
received on (date of the initial deposit) 
and received a conversion request of the initial deposit into a deposit which conforms to the Budapest Treaty on 

(date of the receipt of conversion request) 


V. INTERNATIONAL DEPOSIT AUTHORITY 


Name CNCM 

Collection Nationale de Cultures 
De Microorganismes 


Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s): Simona OZDEN 

Director of CNCM 


Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15


[signature] 
Date : Paris, November 30, 1999 



1 In the case of application of rule 6.4.d), this date is the date on which the authorizing statute for
international deposit was acquired.



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 



RECIPIENTS:




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS


DECLARATION ON VIABILITY
issued in accordance with rule 10.2 by the
INTERNATIONAL DEPOSIT AUTHORITY
identified on the following page


NAME AND ADDRESS OF 
DEPOSITOR 




I. Depositor


II. Identification of the microorganism 


Name : INSTITUT PASTEUR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


Address: Bureau des Brevets et Inventions 
25-28 rue du Docteur Roux 
75015 PARIS 


1-2340 

Date of the deposit ¹ :

OCTOBER 26, 1999 


III. DECLARATION ON THE VIABILITY


The viability of the microorganism identified under heading II was controlled
on OCTOBER 27, 1999 ². At this date the microorganism

[X] was viable ³

[ ] was no longer viable ³





¹ : Indicate the date of the initial deposit or, if a new deposit or a transfer has been made, the most recent of
the relevant dates

² : In the cases referred to in Rule 10.2.a)ii) and iii), mention the most recent viability control.
³ : Tick the appropriate box



IV. CONDITIONS OF THE VIABILITY CONTROL ⁴



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s): 



Yvanne CERISIER 

Administrative CNCM Manager
[signature] 



Georges WAGENER 

CNCM Scientific Adviser for Bacteria 

[signature] 



Date : Paris, November 30, 1999 



⁴ : Only fill in this part when the control is negative



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS



FORMULE INTERNATIONALE 






DESTINATAIRE 






INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS



RECEPISSE EN CAS DE DEPOT INITIAL, 
délivré en vertu de la règle 7.1 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée au bas de cette page



NOM ET ADRESSE
DU DEPOSANT



I. IDENTIFICATION DU MICRO-ORGANISME



Référence d'identification donnée par le
DEPOSANT : 

P5479 


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :

1 - 2339 


II. DESCRIPTION SCIENTIFIQUE ET/OU DESIGNATION TAXONOMIQUE PROPOSEE



Le micro-organisme identifié sous chiffre I était accompagné
d'une description scientifique
d'une désignation taxonomique proposée



(Cocher ce qui convient) 



III. RECEPTION ET ACCEPTATION 



La présente autorité de dépôt internationale accepte le micro-organisme identifié sous
chiffre I, qu'elle a reçu le 26 OCTOBRE 1999 (date du dépôt initial) ¹



IV. RECEPTION D'UNE REQUETE EN CONVERSION



La présente autorité de dépôt internationale a reçu le micro-organisme identifié sous
chiffre I le (date du dépôt initial)

et a reçu une requête en conversion du dépôt initial en dépôt conforme au Traité de
Budapest le (date de réception de la requête en conversion)



V. AUTORITE DE DEPOT INTERNATIONALE 



Nom 



Adresse 



CNCM 

Collection Nationale 

de Cultures de Microorganismes 

INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) : Simona OZDEN



Date : Paris, le 30 novembre 1999




1 En cas d'application de la règle 6.4.d), cette date est la date à laquelle le statut
d'autorité de dépôt internationale a été acquis.



Formule BP/4 (page unique)



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS



FORMULE INTERNATIONALE 






DESTINATAIRE 



Madame Danielle BERNEMAN, 
Bureau des Brevets et Inventions 
INSTITUT PASTEUR 
25-28, rue du Docteur Roux 
75724 PARIS CEDEX 15



DECLARATION SUR LA VIABILITE, 
délivrée en vertu de la règle 10.2 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée à la page suivante



NOM ET ADRESSE DE LA PARTIE 
A LAQUELLE LA DECLARATION SUR LA 
VIABILITE EST DELIVREE 



I. DEPOSANT




II. IDENTIFICATION DU MICRO-ORGANISME 


Nom 


INSTITUT PASTEUR 




Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :


Adresse : Bureau des Brevets et Inventions
25-28, rue du Docteur Roux
75015 PARIS




1 - 2339 

Date du dépôt ou du transfert ¹ :

26 OCTOBRE 1999 


III. DECLARATION SUR LA VIABILITE






La viabilité du micro-organisme identifié sous chiffre II a été contrôlée
le 27 OCTOBRE 1999 ². A cette date, le micro-organisme

[X] ³ était viable

[ ] ³ n'était plus viable







1 Indiquer la date du dépôt initial ou, si un nouveau dépôt ou un transfert ont été
effectués, la plus récente des dates pertinentes (date du nouveau dépôt ou date du transfert).

2 Dans les cas visés à la règle 10.2.a)ii) et iii), mentionner le contrôle de viabilité
le plus récent.



3 Cocher la case qui convient. 



Formule BP/9 (première page)
IV. CONDITIONS DANS LESQUELLES LE CONTROLE DE VIABILITE A ETE EFFECTUE



AUTORITE DE DEPOT INTERNATIONALE 



Nom 



Adresse : 



CNCM 



Collection Nationale

de Cultures de Microorganismes 



INSTITUT PASTEUR

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 
FRANCE 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) :



Simona OZDEN

Directeur de la CNCM




Georges WAGENER 

Conseiller Scientifique de la CNCM 
pour les bactéries



Date : Paris, le 30 novembre 1999




4 A remplir si cette information a été demandée et si les résultats du contrôle étaient
négatifs.



Formule BP/9 (deuxième et dernière page)



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 



RECIPIENTS: 




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 


RECEIPT FOR INITIAL DEPOSIT, 
issued in accordance with rule 7.1 by the 
INTERNATIONAL DEPOSIT AUTHORITY 
identified at the bottom of this page 


NAME AND ADDRESS OF 
DEPOSITOR 




I. IDENTIFICATION OF THE MICROORGANISM


Identification reference given by the 
DEPOSITOR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


P5479 


1-2339 



II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION



The microorganism identified under heading I was accompanied: 



[X] By a scientific description

[X] By a proposed taxonomic description
(Check the appropriate box) 



III. RECEIPT AND ACCEPTANCE

The present International Deposit Authority accepts the microorganism identified under heading I, which it
received on October 26, 1999 (date of the initial deposit) 



IV. RECEIPT OF A REQUEST FOR CONVERSION

The present International Deposit Authority received the microorganism identified under heading I, which it 
received on (date of the initial deposit)

and received a conversion request of the initial deposit into a deposit which conforms to the Budapest Treaty on 

(date of the receipt of conversion request) 



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s): Simona OZDEN 



Director of CNCM 



[signature] 
Date : Paris, November 30, 1999 



1 In the case of application of rule 6.4.d), this date is the date on which the authorizing statute for
international deposit was acquired.



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 

INTERNATIONAL SYSTEM 

RECIPIENTS: 

INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 

NAME AND ADDRESS OF 
DEPOSITOR 



I. Depositor


II. Identification of the microorganism 


Name : INSTITUT PASTEUR 

Address: Bureau des Brevets et Inventions
25-28 rue du Docteur Roux
75015 PARIS


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 

1-2339 

Date of the deposit ¹ :

OCTOBER 26, 1999 


III. DECLARATION ON THE VIABILITY


The viability of the microorganism identified under heading II was controlled
on OCTOBER 27, 1999 ². At this date the microorganism

[X] was viable ³

[ ] was no longer viable ³





¹ : Indicate the date of the initial deposit or, if a new deposit or a transfer has been made, the most recent of
the relevant dates

² : In the cases referred to in Rule 10.2.a)ii) and iii), mention the most recent viability control.
³ : Tick the appropriate box



DECLARATION ON VIABILITY 
issued in accordance with rule 10.2 by the
INTERNATIONAL DEPOSIT AUTHORITY 
identified on the following page 



IV. CONDITIONS OF THE VIABILITY CONTROL ⁴



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures
De Microorganismes 



Address : INSTITUT PASTEUR

28, rue du Docteur Roux
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s):



Yvanne CERISIER 

Administrative CNCM Manager 

[signature] 



Georges WAGENER 

CNCM Scientific Adviser for Bacteria 

[signature] 



Date : Paris, November 30, 1999 



⁴ : Only fill in this part when the control is negative



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS



FORMULE INTERNATIONALE






DESTINATAIRE 



INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 



RECEPISSE EN CAS DE DEPOT INITIAL, 
délivré en vertu de la règle 7.1 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée au bas de cette page



NOM ET ADRESSE
DU DEPOSANT



I. IDENTIFICATION DU MICRO-ORGANISME



Référence d'identification donnée par le
DEPOSANT :

Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :




P8144 


1 - 2026 



II. DESCRIPTION SCIENTIFIQUE ET/OU DESIGNATION TAXONOMIQUE PROPOSEE



Le micro-organisme identifié sous chiffre I était accompagné
d'une description scientifique
d'une désignation taxonomique proposée



(Cocher ce qui convient) 



III. RECEPTION ET ACCEPTATION 



La présente autorité de dépôt internationale accepte le micro-organisme identifié sous
chiffre I, qu'elle a reçu le 25 MAI 1998 (date du dépôt initial)



IV. RECEPTION D'UNE REQUETE EN CONVERSION



La présente autorité de dépôt internationale a reçu le micro-organisme identifié sous
chiffre I le (date du dépôt initial)

et a reçu une requête en conversion du dépôt initial en dépôt conforme au Traité de
Budapest le (date de réception de la requête en conversion)



AUTORITE DE DEPOT INTERNATIONALE 



Nom 



Adresse :



CNCM 

Collection Nationale 

de Cultures de Microorganismes 

INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) : Mme Y. CERISIER

Directeur Administratif de la CNCM



Date : Paris, le 09 juin 1998



1 En cas d'application de la règle 6.4.d), cette date est la date à laquelle le statut
d'autorité de dépôt internationale a été acquis.



Formule BP/4 (page unique) 



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE 
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES 
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS 



FORMULE INTERNATIONALE






DESTINATAIRE 



Madame D. BERNEMAN, 
Bureau des Brevets et Inventions 
INSTITUT PASTEUR
25-28, rue du Docteur Roux 
75724 PARIS CEDEX 15 



DECLARATION SUR LA VIABILITE, 
délivrée en vertu de la règle 10.2 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée à la page suivante



NOM ET ADRESSE DE LA PARTIE 
A LAQUELLE LA DECLARATION SUR LA 
VIABILITE EST DELIVREE 



I. DEPOSANT


II. IDENTIFICATION DU MICRO-ORGANISME


Nom 


INSTITUT PASTEUR 


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :


Adresse : Bureau des Brevets et Inventions
25-28 rue du Docteur Roux 
75015 PARIS 


1 - 2026 

Date du dépôt ou du transfert ¹ :

25 MAI 1998 


III. DECLARATION SUR LA VIABILITE




La viabilité du micro-organisme identifié sous chiffre II a été contrôlée
le 26 MAI 1998 ². A cette date, le micro-organisme

était viable

n'était plus viable





1 Indiquer la date du dépôt initial ou, si un nouveau dépôt ou un transfert ont été
effectués, la plus récente des dates pertinentes (date du nouveau dépôt ou date du transfert).

2 Dans les cas visés à la règle 10.2.a)ii) et iii), mentionner le contrôle de viabilité
le plus récent.



3 Cocher la case qui convient.



Formule BP/9 (première page)



IV. CONDITIONS DANS LESQUELLES LE CONTROLE DE VIABILITE A ETE EFFECTUE ⁴



V. AUTORITE DE DEPOT INTERNATIONALE 



Nom : 


CNCM 

Collection Nationale 

de Cultures de Microorganismes 


Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) :


Adresse : 


INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 
FRANCE 


Yvanne CERISIER
Directeur administratif de la CNCM

Georges WAGENER
Conseiller Scientifique de la CNCM

Date : Paris, le 09 juin 1998



4 A remplir si cette information a été demandée et si les résultats du contrôle étaient
négatifs.



Formule BP/9 (deuxième et dernière page)



BUDAPEST TREATY ON THE INTERNATIONAL
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 



RECIPIENTS: 

INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 

NAME AND ADDRESS OF 
DEPOSITOR 



RECEIPT FOR INITIAL DEPOSIT, 
issued in accordance with rule 7.1 by the 
INTERNATIONAL DEPOSIT AUTHORITY 
identified at the bottom of this page 



I. IDENTIFICATION OF THE MICROORGANISM 



Identification reference given by the 
DEPOSITOR 

p8144 



Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 

1-2026 



II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION 



The microorganism identified under heading I was accompanied: 
[X] By a scientific description

[X] By a proposed taxonomic description
(Check the appropriate box) 



III. RECEIPT AND ACCEPTANCE

The present International Deposit Authority accepts the microorganism identified under heading I, which it
received on May 25, 1998 (date of the initial deposit) 



IV. RECEIPT OF A REQUEST FOR CONVERSION 



The present International Deposit Authority received the microorganism identified under heading I, which it
received on (date of the initial deposit) 

and received a conversion request of the initial deposit into a deposit which conforms to the Budapest Treaty on 

(date of the receipt of conversion request) 



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s): Mme Y, CERISIER 

Administrative director of CNCM 

[signature] 
Date : Paris, June 9, 1998 



1 In the case of application of rule 6.4.d), this date is the date on which the authorizing statute for
international deposit was acquired.



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 

INTERNATIONAL SYSTEM 



RECIPIENTS: 




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 


DECLARATION ON VIABILITY
issued in accordance with rule 10.2 by the
INTERNATIONAL DEPOSIT AUTHORITY
identified on the following page


NAME AND ADDRESS OF 
DEPOSITOR 




I. Depositor


II. Identification of the microorganism


Name : INSTITUT PASTEUR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


Address: Bureau des Brevets et Inventions
25-28 rue du Docteur Roux
75015 PARIS 


1-2026 

Date of the deposit ¹ :

MAY 25, 1998 


III. DECLARATION ON THE VIABILITY


The viability of the microorganism identified under heading II was controlled
on MAY 26, 1998 ². At this date the microorganism

[X] was viable ³

[ ] was no longer viable ³





¹ : Indicate the date of the initial deposit or, if a new deposit or a transfer has been made, the most recent of
the relevant dates

² : In the cases referred to in Rule 10.2.a)ii) and iii), mention the most recent viability control.
³ : Tick the appropriate box



IV. CONDITIONS OF THE VIABILITY CONTROL ⁴



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s):



Yvanne CERISIER 

Administrative CNCM Manager

[signature] 



Date : Paris, June 9, 1998 



Georges WAGENER 

CNCM Scientific Adviser for Bacteria 

[signature] 



⁴ : Only fill in this part when the control is negative



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS 



FORMULE INTERNATIONALE



DESTINATAIRE :

INSTITUT PASTEUR 
Bureau des Brevets et Inventions
25-28, rue du Docteur Roux 
75015 PARIS 



RECEPISSE EN CAS DE DEPOT INITIAL, 
délivré en vertu de la règle 7.1 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée au bas de cette page






NOM ET ADRESSE 
DU DEPOSANT 



I. IDENTIFICATION DU MICRO-ORGANISME 


Référence d'identification donnée par le
DEPOSANT : 

P5366 


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :

1 - 2025 



II. DESCRIPTION SCIENTIFIQUE ET/OU DESIGNATION TAXONOMIQUE PROPOSEE



Le micro-organisme identifié sous chiffre I était accompagné
d'une description scientifique
d'une désignation taxonomique proposée



(Cocher ce qui convient) 



III. RECEPTION ET ACCEPTATION 



La présente autorité de dépôt internationale accepte le micro-organisme identifié sous
chiffre I, qu'elle a reçu le 25 MAI 1998 (date du dépôt initial) ¹



IV. RECEPTION D'UNE REQUETE EN CONVERSION 



La présente autorité de dépôt internationale a reçu le micro-organisme identifié sous
chiffre I le (date du dépôt initial)

et a reçu une requête en conversion du dépôt initial en dépôt conforme au Traité de
Budapest le (date de réception de la requête en conversion)



V. AUTORITE DE DEPOT INTERNATIONALE 



Nom : 


CNCM 

Collection Nationale

de Cultures de Microorganismes 




Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) : Mme Y. CERISIER

Directeur Administratif de la CNCM


Adresse 


INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 




Date : Paris, le 09 juin 1998


1 En cas d'application de la règle 6.4.d), cette date est la date à laquelle le statut
d'autorité de dépôt internationale a été acquis.



Formule BP/4 (page unique)







TRAITE DE BUDAPEST SUR LA RECONNAISSANCE 
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES 
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS 



FORMULE INTERNATIONALE 



DESTINATAIRE 



Madame D. BERNEMAN, 
Bureau des Brevets et Inventions
INSTITUT PASTEUR 
25-28, rue du Docteur Roux 
75724 PARIS CEDEX 15 



DECLARATION SUR LA VIABILITE, 
délivrée en vertu de la règle 10.2 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée à la page suivante



NOM ET ADRESSE DE LA PARTIE 
A LAQUELLE LA DECLARATION SUR LA
VIABILITE EST DELIVREE



I. DEPOSANT 


II. IDENTIFICATION DU MICRO-ORGANISME 


Nom : INSTITUT PASTEUR


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :


Adresse : Bureau des Brevets et Inventions 
25-28 rue du Docteur Roux 
75015 PARIS 


1 - 2025 

Date du dépôt ou du transfert ¹ :

25 MAI 1998 


III. DECLARATION SUR LA VIABILITE 


La viabilité du micro-organisme identifié sous chiffre II a été contrôlée
le 26 MAI 1998 ². A cette date, le micro-organisme

[X] ³ était viable

[ ] ³ n'était plus viable





1 Indiquer la date du dépôt initial ou, si un nouveau dépôt ou un transfert ont été
effectués, la plus récente des dates pertinentes (date du nouveau dépôt ou date du transfert).

2 Dans les cas visés à la règle 10.2.a)ii) et iii), mentionner le contrôle de viabilité
le plus récent.



3 Cocher la case qui convient. 



Formule BP/9 (premiere page) 



IV. CONDITIONS DANS LESQUELLES LE CONTROLE DE VIABILITE A ETE EFFECTUE ⁴



V. AUTORITE DE DEPOT INTERNATIONALE



Nom :



Adresse : 



CNCM 

Collection Nationale 

de Cultures de Microorganismes 



INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 
FRANCE 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) :



Yvanne CERISIER 

Directeur administratif de la CNCM 



Georges WAGENER 



Conseiller Scientifique de la CNCM
pour les bactéries



Date : Paris, le 09 juin 1998




4 A remplir si cette information a été demandée et si les résultats du contrôle étaient
négatifs.



Formule BP/9 (deuxieme et derniere page) 



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 

RECIPIENTS: 

INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 

NAME AND ADDRESS OF 
DEPOSITOR 



I. IDENTIFICATION OF THE MICROORGANISM 



Identification reference given by the 
DEPOSITOR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 




135366 


I-2025





II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION



The microorganism identified under heading I was accompanied: 



H By a scientific description 



B By a proposed taxonomic description 
(Check the appropriate box) 



III. RECEIPT AND ACCEPTANCE

The present International Deposit Authority accepts the microorganism identified under heading I, which it 
received on May 25, 1998 (date of the initial deposit) 



RECEIPT FOR INITIAL DEPOSIT, 
issued in accordance with rule 7.1 by the 
INTERNATIONAL DEPOSIT AUTHORITY 
identified at the bottom of this page 



IV. RECEIPT OF A REQUEST FOR CONVERSION

The present International Deposit Authority received the microorganism identified under heading I
on (date of the initial deposit)

and received a conversion request of the initial deposit into a deposit which conforms to the Budapest Treaty on 

(date of the receipt of conversion request) 



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s): Mme Y. CERISIER 

Administrative director of CNCM 

[signature] 
Date : Paris, June 9, 1998 



1 In the case of application of rule 6.4.d), this date is the date on which the status of
international deposit authority was acquired.



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 

INTERNATIONAL SYSTEM 



RECIPIENTS: 




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS


DECLARATION ON VIABILITY 

issued in accordance with rule 10.2 by the

INTERNATIONAL DEPOSIT AUTHORITY

identified on the following page


NAME AND ADDRESS OF 
DEPOSITOR 




I. Depositor


II. Identification of the microorganism 


Name : INSTITUT PASTEUR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


Address: Bureau des Brevets et inventions 
25-28 rue du Docteur Roux
75015 PARIS


I-2026

Date of the deposit ¹ :

MAY 25, 1998 


III. DECLARATION ON THE VIABILITY


The viability of the microorganism identified under heading II was tested
on MAY 26, 1998 ². On that date the microorganism


[X] was viable ³


[ ] was no longer viable ³





¹ Indicate the date of the initial deposit or, if a new deposit or a transfer has been made, the most recent of
the relevant dates.

² In the cases referred to in Rule 10.2a)ii) and iii), mention the most recent viability test.
³ Tick the appropriate box.



IV. CONDITIONS OF THE VIABILITY CONTROL ⁴



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s): 



Yvanne CERISIER 

Administrative CNCM Manager 

[signature] 



Date : Paris, June 9, 1998 



Georges WAGENER 

CNCM Scientific Adviser for Bacteria 

[signature] 



⁴ Fill in this part only when the result of the control is negative.



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS



FORMULE INTERNATIONALE 



DESTINATAIRE 



Madame D. BERNEMAN, 
Bureau des Brevets et Inventions
INSTITUT PASTEUR
25-28, rue du Docteur Roux
75724 PARIS CEDEX 15



DECLARATION SUR LA VIABILITE, 
délivrée en vertu de la règle 10.2 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée à la page suivante



NOM ET ADRESSE DE LA PARTIE 
A LAQUELLE LA DECLARATION SUR LA
VIABILITE EST DELIVREE



I. DEPOSANT 


II. IDENTIFICATION DU MICRO-ORGANISME


Nom : INSTITUT PASTEUR




Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :


Adresse : Bureau des Brevets et Inventions
25-28 rue du Docteur Roux 
75015 PARIS 




I-2027

Date du dépôt ou du transfert ¹ :

25 MAI 1998 


III. DECLARATION SUR LA VIABILITE 


La viabilité du micro-organisme identifié sous chiffre II a été contrôlée
le 26 MAI 1998 ². A cette date, le micro-organisme


[X] ³ était viable


[ ] ³ n'était plus viable







1 Indiquer la date du dépôt initial ou, si un nouveau dépôt ou un transfert ont été
effectués, la plus récente des dates pertinentes (date du nouveau dépôt ou date du transfert).

2 Dans les cas visés à la règle 10.2.a)ii) et iii), mentionner le contrôle de viabilité
le plus récent.



3 Cocher la case qui convient. 



Formule BP/9 (premiere page) 

IV. CONDITIONS DANS LESQUELLES LE CONTROLE DE VIABILITE A ETE EFFECTUE ⁴



V. AUTORITE DE DEPOT INTERNATIONALE



Nom 



Adresse 



CNCM 

Collection Nationale

de Cultures de Microorganismes 



INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15
FRANCE 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) :



Yvanne CERISIER 

Directeur administratif de la CNCM



Georges WAGENER

Conseiller Scientifique de la CNCM
pour les bactéries



Date 



Paris, le 09 juin 1998




4 A remplir si cette information a été demandée et si les résultats du contrôle étaient
négatifs.



Formule BP/9 (deuxième et dernière page)



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS 



FORMULE INTERNATIONALE 



DESTINATAIRE 



INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS



RECEPISSE EN CAS DE DEPOT INITIAL,
délivré en vertu de la règle 7.1 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiee au bas de cette page 



NOM ET ADRESSE
DU DEPOSANT 



I. IDENTIFICATION DU MICRO-ORGANISME



Référence d'identification donnée par le
DEPOSANT :

P8146


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :

I-2027


II. DESCRIPTION SCIENTIFIQUE ET/OU DESIGNATION TAXONOMIQUE PROPOSEE



Le micro-organisme identifié sous chiffre I était accompagné
d'une description scientifique

d'une désignation taxonomique proposée



(Cocher ce qui convient)



III. RECEPTION ET ACCEPTATION 



La présente autorité de dépôt internationale accepte le micro-organisme identifié sous
chiffre I, qu'elle a reçu le 25 MAI 1998 (date du dépôt initial) ¹



IV. RECEPTION D'UNE REQUETE EN CONVERSION 



La présente autorité de dépôt internationale a reçu le micro-organisme identifié sous
chiffre I le (date du dépôt initial)

et a reçu une requête en conversion du dépôt initial en dépôt conforme au Traité de
Budapest le (date de réception de la requête en conversion)



V. AUTORITE DE DEPOT INTERNATIONALE 



Nom 



Adresse 



CNCM 



Collection Nationale 

de Cultures de Microorganismes 

INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)

employé(s) autorisé(s) : Mme Y. CERISIER

Directeur Administratif de la CNCM



Date : 



Paris, le 09 juin 1998 



1 En cas d'application de la règle 6.4.d), cette date est la date à laquelle le statut
d'autorité de dépôt internationale a été acquis.



Formule BP/4 (page unique)



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 



RECIPIENTS: 




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 


RECEIPT FOR INITIAL DEPOSIT, 
issued in accordance with rule 7.1 by the 
INTERNATIONAL DEPOSIT AUTHORITY 
identified at the bottom of this page 


NAME AND ADDRESS OF 
DEPOSITOR 




I. IDENTIFICATION OF THE MICROORGANISM


Identification reference given by the 
DEPOSITOR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


P8146 


I-2027



II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION



The microorganism identified under heading I was accompanied: 

M By a scientific description 

Bi By a proposed taxonomic description 
(Check the appropriate box) 



III. RECEIPT AND ACCEPTANCE



The present International Deposit Authority accepts the microorganism identified under heading I, which it
received on May 25, 1998 (date of the initial deposit) ¹



IV. RECEIPT OF A REQUEST FOR CONVERSION 



The present International Deposit Authority received the microorganism identified under heading I
on (date of the initial deposit)

and received a conversion request of the initial deposit into a deposit which conforms to the Budapest Treaty on 

(date of the receipt of conversion request) 



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the
International Deposit Authority or the authorized
employee(s): Mme Y. CERISIER 



Administrative director of CNCM 



[signature] 
Date : Paris, June 9, 1998 



1 In the case of application of rule 6.4.d), this date is the date on which the status of
international deposit authority was acquired.



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 

INTERNATIONAL SYSTEM 



RECIPIENTS: 




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS


DECLARATION ON VIABILITY 

issued in accordance with rule 10.2 by the

INTERNATIONAL DEPOSIT AUTHORITY

identified on the following page


NAME AND ADDRESS OF 
DEPOSITOR 




I. Depositor


II. Identification of the microorganism


Name : INSTITUT PASTEUR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


Address: Bureau des Brevets et inventions 
25-28 rue du Docteur Roux 
75015 PARIS 


I-2027

Date of the deposit ¹ :

MAY 25, 1998 


III. DECLARATION ON THE VIABILITY


The viability of the microorganism identified under heading II was tested
on MAY 26, 1998 ². On that date the microorganism


[X] was viable ³


[ ] was no longer viable ³





¹ Indicate the date of the initial deposit or, if a new deposit or a transfer has been made, the most recent of
the relevant dates.

² In the cases referred to in Rule 10.2a)ii) and iii), mention the most recent viability test.
³ Tick the appropriate box.



IV. CONDITIONS OF THE VIABILITY CONTROL ⁴



V. INTERNATIONAL DEPOSIT AUTHORITY



Name : CNCM 

Collection Nationale de Cultures
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the
International Deposit Authority or the authorized
employee(s):



Yvanne CERISIER 

Administrative CNCM Manager 

[signature] 



Date : Paris, June 9, 1998 



Georges WAGENER 

CNCM Scientific Adviser for Bacteria 

[signature] 



⁴ Fill in this part only when the result of the control is negative.



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES 
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS 



FORMULE INTERNATIONALE



DESTINATAIRE 



INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS 



RECEPISSE EN CAS DE DEPOT INITIAL, 
délivré en vertu de la règle 7.1 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée au bas de cette page



NOM ET ADRESSE 
DU DEPOSANT 



I. IDENTIFICATION DU MICRO-ORGANISME


Référence d'identification donnée par le
DEPOSANT :

P5486


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :

I-2341



II. DESCRIPTION SCIENTIFIQUE ET/OU DESIGNATION TAXONOMIQUE PROPOSEE



Le micro-organisme identifié sous chiffre I était accompagné
d'une description scientifique
d'une désignation taxonomique proposée



(Cocher ce qui convient)



III. RECEPTION ET ACCEPTATION 



La présente autorité de dépôt internationale accepte le micro-organisme identifié sous
chiffre I, qu'elle a reçu le 26 OCTOBRE 1999 (date du dépôt initial) ¹



IV. RECEPTION D'UNE REQUETE EN CONVERSION 



La présente autorité de dépôt internationale a reçu le micro-organisme identifié sous
chiffre I le (date du dépôt initial)

et a reçu une requête en conversion du dépôt initial en dépôt conforme au Traité de
Budapest le (date de réception de la requête en conversion)



V. AUTORITE DE DEPOT INTERNATIONALE 



Nom 



Adresse



CNCM 

Collection Nationale

de Cultures de Microorganismes 

INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)

employé(s) autorisé(s) : Simona OZDEN

Directeur de la CNCM



Date : Paris, le 30 novembre 1999




1 En cas d'application de la règle 6.4.d), cette date est la date à laquelle le statut
d'autorité de dépôt internationale a été acquis.



Formule BP/4 (page unique) 



TRAITE DE BUDAPEST SUR LA RECONNAISSANCE 
INTERNATIONALE DU DEPOT DES MICRO-ORGANISMES
AUX FINS DE LA PROCEDURE EN MATIERE DE BREVETS



FORMULE INTERNATIONALE 



DESTINATAIRE : 

Madame Danielle BERNEMAN, 
Bureau des Brevets et Inventions 
INSTITUT PASTEUR 
25-28, rue du Docteur Roux
75724 PARIS CEDEX 15 



DECLARATION SUR LA VIABILITE, 
délivrée en vertu de la règle 10.2 par
l'AUTORITE DE DEPOT INTERNATIONALE
identifiée à la page suivante



NOM ET ADRESSE DE LA PARTIE 
A LAQUELLE LA DECLARATION SUR LA 
VIABILITE EST DELIVREE 



I. DEPOSANT 


II. IDENTIFICATION DU MICRO-ORGANISME 


Nom : INSTITUT PASTEUR 

Adresse : Bureau des Brevets et Inventions
25-28, rue du Docteur Roux
75015 PARIS


Numéro d'ordre attribué par
l'AUTORITE DE DEPOT INTERNATIONALE :

I-2341

Date du dépôt ou du transfert ¹ :

26 OCTOBRE 1999 



III. DECLARATION SUR LA VIABILITE 



La viabilité du micro-organisme identifié sous chiffre II a été contrôlée
le 27 OCTOBRE 1999 ². A cette date, le micro-organisme



³ était viable



³ n'était plus viable



1 Indiquer la date du dépôt initial ou, si un nouveau dépôt ou un transfert ont été
effectués, la plus récente des dates pertinentes (date du nouveau dépôt ou date du transfert).

2 Dans les cas visés à la règle 10.2.a)ii) et iii), mentionner le contrôle de viabilité
le plus récent.



3 Cocher la case qui convient. 

IV. CONDITIONS DANS LESQUELLES LE CONTROLE DE VIABILITE A ETE EFFECTUE ⁴



V. AUTORITE DE DEPOT INTERNATIONALE 



Nom :



Adresse 



CNCM 

Collection Nationale

de Cultures de Microorganismes 



INSTITUT PASTEUR 

28, Rue du Docteur Roux 
F-75724 PARIS CEDEX 15 
FRANCE 



Signature(s) de la (des) personne(s)
compétente(s) pour représenter l'autorité
de dépôt internationale ou de l'(des)
employé(s) autorisé(s) :



Simona OZDEN

Directeur de la CNCM




Georges WAGENER 

Conseiller Scientifique de la CNCM
pour les bactéries



Paris, le 30 novembre 1999




4 A remplir si cette information a été demandée et si les résultats du contrôle étaient
négatifs.



BUDAPEST TREATY ON THE INTERNATIONAL
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 



RECIPIENTS: 




INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS


RECEIPT FOR INITIAL DEPOSIT, 
issued in accordance with rule 7.1 by the 
INTERNATIONAL DEPOSIT AUTHORITY 
identified at the bottom of this page 


NAME AND ADDRESS OF 
DEPOSITOR 




I. IDENTIFICATION OF THE MICROORGANISM


Identification reference given by the 
DEPOSITOR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


P5486 


I-2341



II. SCIENTIFIC DESCRIPTION AND/OR PROPOSED TAXONOMIC DESIGNATION



The microorganism identified under heading I was accompanied: 



H By a scientific description 



B By a proposed taxonomic description 
(Check the appropriate box) 



III. RECEIPT AND ACCEPTANCE

The present International Deposit Authority accepts the microorganism identified under heading I, which it 
received on October 26, 1999 (date of the initial deposit) 



IV. RECEIPT OF A REQUEST FOR CONVERSION

The present International Deposit Authority received the microorganism identified under heading I
on (date of the initial deposit)

and received a conversion request of the initial deposit into a deposit which conforms to the Budapest Treaty on 

(date of the receipt of conversion request) 



V. INTERNATIONAL DEPOSIT AUTHORITY 



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15 



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized
employee(s): Simona OZDEN 

Director of CNCM 

[signature] 
Date : Paris, November 30, 1999 



1 In the case of application of rule 6.4.d), this date is the date on which the status of
international deposit authority was acquired.



BUDAPEST TREATY ON THE INTERNATIONAL 
RECOGNITION OF THE DEPOSIT OF MICROORGANISMS
FOR THE PURPOSES OF PATENT PROCEDURE 



INTERNATIONAL SYSTEM 







INSTITUT PASTEUR 
Bureau des Brevets et Inventions 
25-28, rue du Docteur Roux 
75015 PARIS


DECLARATION ON VIABILITY
issued in accordance with rule 10.2 by the
INTERNATIONAL DEPOSIT AUTHORITY 
identified on the following page 


NAME AND ADDRESS OF 
DEPOSITOR 




I. Depositor


II. Identification of the microorganism


Name : INSTITUT PASTEUR 


Serial number given by the 
INTERNATIONAL DEPOSIT AUTHORITY 


Address: Bureau des Brevets et inventions 
25-28 rue du Docteur Roux 
75015 PARIS 


I-2341

Date of the deposit ¹ :

OCTOBER 26, 1999 


III. DECLARATION ON THE VIABILITY


The viability of the microorganism identified under heading II was tested
on OCTOBER 27, 1999 ². On that date the microorganism


[X] was viable ³


[ ] was no longer viable ³





¹ Indicate the date of the initial deposit or, if a new deposit or a transfer has been made, the most recent of
the relevant dates.

² In the cases referred to in Rule 10.2a)ii) and iii), mention the most recent viability test.
³ Tick the appropriate box.



IV. CONDITIONS OF THE VIABILITY CONTROL ⁴



V. INTERNATIONAL DEPOSIT AUTHORITY



Name : CNCM 

Collection Nationale de Cultures 
De Microorganismes 



Address : INSTITUT PASTEUR 

28, rue du Docteur Roux 
F-75724 PARIS CEDEX 15



Signature(s) of the person(s) competent to represent the 
International Deposit Authority or the authorized 
employee(s): 



Yvanne CERISIER 

Administrative CNCM Manager 

[signature] 



Georges WAGENER 

CNCM Scientific Adviser for Bacteria 

[signature] 



Date : Paris, November 30, 1999 



⁴ Fill in this part only when the result of the control is negative.

The VASI gene encoding the valyl-tRNA synthetase from yeast was
isolated and sequenced. The gene-derived amino acid sequence of yeast
valyl-tRNA synthetase was found to be 23% homologous to the
Escherichia coli isoleucyl-tRNA synthetase. This is the highest
level of homology reported so far between two distinct aminoacyl-tRNA
synthetases and is indicative of an evolutionary relationship between
these two molecules. Within these homologous sequences, two functional
regions could be recognized: the HIGH region, which forms part of the
binding site of ATP, and the KMSKS region, which is recognized as the
consensus sequence for the binding of the 3'-end of tRNA (Hountondji,
Dessen, Ph., and Blanquet, S. (1986) Biochimie (Paris) 68, 1071-1078).
Secondary structure predictions as well as the presence of both HIGH
and KMSKS regions, delineating the nucleotide-binding domain and the
COOH-terminal helical domain in aminoacyl-tRNA synthetases of known
three-dimensional structure, suggest that the yeast valyl-tRNA
synthetase polypeptide chain can be folded into three domains: an
NH2-terminal α-helical region followed by a nucleotide-binding
topology and a COOH-terminal domain composed of α-helices which
probably carries major sites in tRNA binding.



The aminoacyl-tRNA synthetases are a vastly divergent 
family of enzymes differing in size and subunit structure but 
catalyzing the same reaction, the formation of an aminoacyl- 
tRNA, specific for both the amino acid and the tRNA. The 
mechanism of the aminoacylation involves the initial rapid 
formation of an aminoacyladenylate complex followed by the 
transfer of the aminoacyl moiety to the tRNA. Valyl-tRNA
synthetase from yeast is a monomer of Mr 120,000 (Kern et 

* This work was supported by grants from the Centre National de
la Recherche Scientifique. The costs of publication of this article
were defrayed in part by the payment of page charges. This article
must therefore be hereby marked "advertisement" in accordance with
18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted
to the GenBank/EMBL Data Bank with accession number(s)
J02719.

‡ Supported by a United Nations Education, Science, and Culture
Organization long-term fellowship. Permanent address: Facultad de
Medicina, Casilla 6667, Santiago 7, Chile.

§ Supported by a European Molecular Biology Organization
short-term fellowship.

¶ Permanent address: Max-Planck-Institut für Experimentelle
Medizin, Abteilung Chemie, Hermann-Rein-Strasse 3, D-3400
Goettingen, West Germany.

** To whom correspondence should be addressed.



al., 1975) and belongs, together with leucyl- and isoleucyl-tRNA
synthetases, to the class of enzymes having the largest polypeptide
chain. Activation of a single amino acid by the aminoacyl-tRNA
synthetase is, in most cases, very specific. However, valyl- and
isoleucyl-tRNA synthetases do not discriminate between closely
related amino acids in the adenylate formation step. In neither of
these cases, however, is the misactivated amino acid used to form a
stable aminoacyl-tRNA. The mechanism of rejection is designated as a
proofreading or editing mechanism. The isoleucyl- and valyl-tRNA
synthetases are known to hydrolyze the misactivated valyl and
threonyl adenylates, respectively (Baldwin and Berg, 1966; Fersht and
Kaethner, 1976; Igloi et al., 1977). Knowledge of their structure
should be useful in defining structural elements involved in
catalysis and/or specificity. The entire primary structure of
Escherichia coli isoleucyl-tRNA synthetase has been reported
(Webster et al., 1984). We present here the isolation and sequence of
the VASI Saccharomyces cerevisiae gene coding for valyl-tRNA
synthetase. Comparison of the translated amino acid sequence with
that of isoleucyl-tRNA synthetase from E. coli shows the strongest
homology ever reported for two distinct aminoacyl-tRNA synthetases.

MATERIALS AND METHODS 

Yeast, Bacteria, Plasmids, Gene Libraries, and Growth Media — The
yeast genomic bank from S. cerevisiae strain X 2180 in phage λgt11
and the host strain Y1090 (Young and Davis, 1983a, 1983b) were
kindly provided by Dr. R. Young (Whitehead, MIT). The yeast
genomic bank from S. cerevisiae strain FL100 in the plasmid vector
pFL1 (Chevallier et al., 1980) was a gift from Dr. Lacroute (IBMC,
Strasbourg, France). The strain FF1.1 (mes1, ura3) was the recipient
for yeast transformation (Fasiolo et al., 1981). Parental and
transformed yeast strains were grown on YNB (0.67% yeast nitrogen base
without amino acids, 2% glucose) supplemented with 100 μg/ml
methionine. Transformations of yeast and E. coli and preparation of
nucleic acids were done using standard procedures.

Enzymes and Reagents — Restriction endonucleases, T4 DNA
ligase, and E. coli DNA polymerase I (Klenow fragment) were
purchased from Boehringer Mannheim. [α-32P]dATP, α-35S-labeled
dATP, and were purchased from New England Nuclear.

Antibody Preparation and Plaque Screening — Homogeneous yeast
valyl-tRNA synthetase was prepared in our laboratory by Drs. D.
Kern and R. Giege. Rabbits were immunized at 15-day intervals by
three subcutaneous injections of 500 μg of enzyme dissolved in 500 μl
of 10 mM potassium phosphate buffer (pH 7.4), 150 mM NaCl and
emulsified in 500 μl of complete Freund's adjuvant. One week after
the last injection, the rabbits were bled, and the immunoglobulin
fraction was purified from the serum by ammonium sulfate
precipitation and DEAE-Sephadex chromatography. Purified antibodies
were prepared by chromatography on valyl-tRNA synthetase bound
to succinylaminoethyl-Sepharose 4B. Ten nmol of enzyme were
coupled to 5 ml of packed gel with
N-cyclohexyl-N'-[β-(N-methylmorpholino)ethyl]carbodiimide p-toluenesulfonate.

Screening of the λgt11 genomic library was carried out essentially
as described by Young and Davis (1983b) using affinity-purified
antibodies at a concentration of 5-10 μg/ml and 125I-protein A (50
μCi/μg) at 1 μCi/ml. Positive plaques were purified by four additional
cycles of screening.

Hybridization Procedures — DNA probes were purified by gel
electrophoresis or sucrose gradient centrifugation from phage λgt11 or
recombinant plasmids digested with the appropriate restriction
enzymes. They were labeled by nick translation as described by Maniatis
et al. (1982). DNA probes cloned in M13 phage were labeled by chain
extension using the Klenow fragment of E. coli DNA polymerase I
and [α-32P]dATP. The yeast genomic bank in vector pFL1 was
screened by the high density colony-screening procedure described by
Hanahan and Meselson (1983). Positive clones were purified by two
additional cycles of screening. Southern blot hybridizations were
carried out according to the procedures described by Maniatis et al.
(1982).

Determination of Enzymatic Activities — Cytoplasmic valyl-tRNA 
synthetase was tested in crude extracts obtained by mechanical 
breakage with glass beads of exponentially growing cells. Protein 
concentration was estimated according to Bradford (1976). 

The enzyme was tested using unfractionated yeast cytoplasmic
tRNA under the following conditions: 144 mM Tris-HCl (pH 7.8), 5
mM dithiothreitol, 10 mM ATP, 2 mM MgCl2, 0.1 mM [14C]valine
(25,000 cpm/nmol), 6 mg/ml yeast tRNA, and various amounts of
crude extracts. The reaction mixture was 200 μl; and at various time
intervals, 40-μl aliquots were spotted onto Whatman paper discs and
quenched with 5% trichloroacetic acid. The precipitated aminoacylated
tRNA was subjected to scintillation counting.
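
The conversion from scintillation counts to charged valine implied by these assay conditions can be sketched as follows. This is only an illustration: the specific activity (25,000 cpm/nmol) and the 40-μl and 200-μl volumes come from the text, while the function names, the optional background correction, and the 5,000-cpm reading are hypothetical and not part of the paper.

```python
SPECIFIC_ACTIVITY_CPM_PER_NMOL = 25_000.0  # from the assay conditions in the text
ALIQUOT_UL = 40.0                          # volume spotted per time point
REACTION_UL = 200.0                        # total reaction volume


def nmol_valine_in_aliquot(cpm, background_cpm=0.0):
    """Convert acid-precipitable counts from one aliquot to nmol of valine."""
    # Background subtraction is an assumption; the paper does not describe it.
    net = max(cpm - background_cpm, 0.0)
    return net / SPECIFIC_ACTIVITY_CPM_PER_NMOL


def nmol_valine_in_reaction(cpm, background_cpm=0.0):
    """Scale one 40-ul aliquot up to the whole 200-ul reaction mixture."""
    return nmol_valine_in_aliquot(cpm, background_cpm) * REACTION_UL / ALIQUOT_UL


if __name__ == "__main__":
    # Hypothetical reading of 5,000 cpm on one disc, no background correction.
    print(nmol_valine_in_aliquot(5_000))
    print(nmol_valine_in_reaction(5_000))
```

Dividing the result by the assay time and by the amount of extract protein (Bradford estimate) would give a specific activity that can be compared between strains.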

Western Blot — Protein samples were run on 10% polyacrylamide
gels in the presence of 0.1% sodium dodecyl sulfate (Laemmli, 1970).
Conditions for the transfer of proteins to nitrocellulose membranes
were as described in the Schleicher & Schuell manual (No. 2). The
protein band corresponding to valyl-tRNA synthetase was detected
as described above using affinity-purified antibodies (5-10 μg/ml) and
125I-protein A (0.1 μCi/ml).

DNA Sequence Analysis — The dideoxy-DNA sequencing method
of Sanger et al. (1977) was used. EcoRI and SalI digestions of pVASI
recombinant generated fragments of 1.6, 1.2, and 1.3 kb, respectively.
These DNA fragments were isolated and digested with AluI, HaeIII,
TaqI, and Sau3A. The resulting subfragments as well as the original
fragments were cloned into suitable M13mp8 and M13mp9 vectors
(Vieira and Messing, 1982).

Computer Analysis of Amino Acid Sequences — Amino acid
sequences were analyzed with programs of the University of Wisconsin
Genetics Computer Group edited by Dereveux and Haeberli¹ to locate
sequence patterns: "Best fit" to align two sequences; "Gap" to find
the optimal alignment for two sequences by adding gaps in either one
to achieve the maximum number of matches; "Dotplot" and "Pepplot"
to visualize the homology between two sequences; and "Choufas" to
perform prediction of secondary structures.
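
As an illustration of what an optimal gapped alignment of two sequences involves, here is a minimal Needleman-Wunsch style sketch. It is not the "Gap" program itself: the scoring values (match +1, mismatch -1, linear gap -2), the function names, and the toy peptides are assumptions chosen for readability, and a real protein comparison would use a substitution matrix.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of two sequences with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical positions as a percentage of the alignment length."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    top, bottom, total = global_align("HIGH", "HGH")  # toy peptides only
    print(top, bottom, total, percent_identity(top, bottom))
```

A homology figure such as the one quoted in the summary is a percentage of this kind computed over the aligned region, although the exact denominator used by the authors is not stated here.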

RESULTS 

Cloning of the VASI Gene — We have screened a yeast DNA
library using the expression vector λgt11 which contains
random genomic fragments in the unique EcoRI site (Young
and Davis, 1983a, 1983b). Ten putative positive clones were
obtained and further purified by three successive rounds of
antibody screening at low plaque density after which only one
clone remained positive. Yeast DNA inserted into the λgt11
recombinant is 2.5 kb,² whereas the minimum expected length
of the message for a protein of Mr 120,000 (Kern et al., 1975)
is about 3.5 kb. In order to isolate the complete gene coding
for valyl-tRNA synthetase, we have screened the pFL1 yeast
DNA library (Chevallier et al., 1980) using the yeast EcoRI
fragment from the λgt11 recombinant as hybridization probe.
Only three clones (pVASI-1, -2, and -3) were purified, and
their overlapping inserts were mapped with a number of
restriction enzymes. Southern blot hybridization analysis of
yeast nuclear DNA gave an identical genomic map for the two
¹ Dereveux, J., and Haeberli, P. (1983) Program Library of the
University of Wisconsin Genetics Computer Group, Madison, WI.
² The abbreviation used is: kb, kilobase.



EcoRI and HindIII sites (Fig. 1).
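
The size argument in the cloning paragraph (a 2.5-kb insert cannot encode a protein of 120,000 daltons) can be checked with a short calculation. The average residue mass of 110 Da is a common rule of thumb and an assumption here, not a value from the paper; the function name is hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb value, not from the paper


def min_coding_length_nt(protein_mass_da, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Estimate the coding length in nucleotides for a protein of a given mass."""
    residues = protein_mass_da / residue_mass_da
    return 3.0 * residues  # three nucleotides per codon


if __name__ == "__main__":
    coding_nt = min_coding_length_nt(120_000)
    print(round(coding_nt))      # coding region only, in nucleotides
    print(coding_nt > 2_500)     # True if the 2.5-kb insert is too short
```

Under this assumption the coding region alone comes to roughly 3.3 kb. The figure of about 3.5 kb quoted in the text presumably also allows for untranslated message sequence, but that reading is an inference.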

To demonstrate that the cloned gene codes for valyl-tRNA
synthetase, we expressed the various clones in yeast to give
catalytically active valyl-tRNA synthetase. The activity in
the crude extracts of the yeast transformants (pVASI-1 and
-2) was approximately 10 times higher than the basal level
of enzyme in the recipient strain. In order to verify that the
activity was associated with a full-length protein in the
overproducing strains, proteins from a crude cytoplasmic extract
were separated by electrophoresis on sodium dodecyl
sulfate-polyacrylamide gels and transferred to nitrocellulose, and
valyl-tRNA synthetase was detected using the specific
cytoplasmic valyl-tRNA synthetase antibodies and 125I-labeled
protein A. The results of the Western blot analyses are shown
in Fig. 2. A protein band which co-migrated with the purified
cytoplasmic valyl-tRNA synthetase was detected in the crude
extract of the recipient strain (lane 2). The concentration of
this protein was increased (lanes 3 and 4) in yeast
transformants harboring the VASI gene on a multicopy plasmid
(pVASI-1 and -2). The level of valyl-tRNA synthetase in the
transformant corresponding to clone pVASI-3 was again
similar to the basal level of the recipient strain and was probably
due to lack of the 5'-upstream promoter sequences.

Determination of the Nucleotide Sequence of the VASI
Gene — We have determined 80% of the entire sequence on
both strands, and on one strand, the remaining 20%. All
restriction endonuclease sites used for generating M13 clones
were overlapped. This strategy enabled us to localize a
78-base pair EcoRI fragment between the large 1.6- and 1.2-kb
EcoRI subfragments. A long open reading frame of 3,312
nucleotides was found only on one strand (Fig. 3). The trans-



Fig. 1. Restriction map in the VAS1 genomic region. a, the
restriction map was determined by Southern analysis using yeast
genomic DNA. The box indicates the extent of the VAS1 coding
region. The numbers refer to the size (in kilobases) of the EcoRI
fragments. b, clones obtained from the screening of the pFL1 yeast
DNA library are designated pVAS1-1 to -3. E, EcoRI; H, HindIII; Hp,
HpaI. pVAS1 clones were aligned with respect to EcoRI fragments.




Fig. 2. Western blot of valyl-tRNA synthetase in crude extracts
from recipient and yeast transformants. Lanes 1 and 6,
100 ng of purified cytoplasmic valyl-tRNA synthetase; lanes 2-5, 30
µg of cytosol protein from recipient (lane 2), transformant pVAS1-1
(lane 3), transformant pVAS1-2 (lane 4), and transformant pVAS1-3
(lane 5).
[Fig. 3. Nucleotide sequence of the VAS1 open reading frame with the translated amino acid sequence; the sequence itself is not legible in this copy.]



lated amino acid sequence from the first in-phase methionine 
codon includes 1,104 amino acid residues, yielding a protein
of Mr 125,000, in good agreement with the Mr measured for
the purified protein. Attempts to define the NH2-terminal
peptide of the protein were unsuccessful due to a blocked NH2
terminus.
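
As a quick consistency check on these figures, the following minimal sketch converts an open reading frame length into a residue count and a rough molecular mass. It assumes an average residue mass of about 110 Da, a common rule of thumb that does not come from the paper, and the function name `orf_summary` is purely illustrative.

```python
# Rough consistency check between ORF length, residue count, and protein mass.
# AVERAGE_RESIDUE_MASS_DA is a rule-of-thumb assumption, not a value from the paper.
AVERAGE_RESIDUE_MASS_DA = 110.0


def orf_summary(orf_nucleotides: int, includes_stop: bool = False) -> dict:
    """Return codon count, residue count, and an approximate mass in daltons."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 nucleotides")
    codons = orf_nucleotides // 3
    # If the quoted length counts the stop codon, it does not encode a residue.
    residues = codons - 1 if includes_stop else codons
    return {
        "codons": codons,
        "residues": residues,
        "approx_mass_da": residues * AVERAGE_RESIDUE_MASS_DA,
    }


summary = orf_summary(3312)
print(summary)
```

Under that assumption, 3,312 nucleotides correspond to 1,104 codons and a mass near 121,000, within a few percent of the reported Mr of 125,000.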

DISCUSSION 

Sequence homologies among different aminoacyl-tRNA
synthetases, with the exception of those specific for the same
amino acid in different organisms, are rare or nonexistent.
Similarities at the three-dimensional level of these enzymes,
however, are expected to be much greater due to structural
constraints imposed on the binding of tRNA, which probably
shares the same tertiary conformation (Moras et al., 1980),
and to the necessity of bringing the adenylate site close to the
terminal adenosine site of tRNA in order to achieve the
chemical acylation step. Since the ATP and the 3'-CCA arm
of tRNA are common to all aminoacyl-tRNA synthetases, it
is reasonable to assume that identical or at least functionally
equivalent residues are present in many aminoacyl-tRNA
synthetases. Hence, a comparison of primary sequences can
be useful to identify important binding and/or catalytic residues.
A classical example derives from a structural comparison
of E. coli methionyl-tRNA synthetase and Bacillus stearothermophilus
tyrosyl-tRNA synthetase (Blow et al., 1983).
The three-dimensional structures of both enzymes indicate
folding of the NH2-terminal regions into similar and characteristic
nucleotide-binding domains, although there is only a
short stretch of amino acid sequence homology. In particular,
1 cysteine and 2 histidine residues occupy identical positions
in the two tertiary structures (Barker and Winter, 1982; Blow
et al., 1983). These conserved residues are involved in the
binding and catalysis of adenylate formation as demonstrated
by site-directed mutagenesis experiments (Winter et al., 1982;
Leatherbarrow et al., 1985).

The NH2-terminal region of E. coli isoleucyl-tRNA synthe- 
tase shows a sequence homology of 11 consecutive amino 
acids with the corresponding region of E. coli methionyl-tRNA 
synthetase which allowed the authors (Webster et al., 1984) 
to conclude that isoleucyl-tRNA synthetase is similarly folded 
in an alternating β/α structure. The perfect peptide match
includes the consensus HIGH region involved in ATP binding 
(see below). 

We have compared the deduced amino acid sequences of
yeast valyl-tRNA synthetase and E. coli isoleucyl-tRNA synthetase.
Residues 177-726 of the yeast enzyme could be
aligned with residues 50-618 of the bacterial enzyme (Fig. 4).
Fig. 4 shows four short perfect matches of 5-13 conserved
residues at the following peptide positions in the yeast sequence:
196-200, 431-435, 564-570, and 700-712. The overall
homology is 23%.^ Two functional regions can be recognized
within this homology: one at the ATP-binding site and the
other at the probable CCA-binding site of tRNA.
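
To make the two quantities in this paragraph concrete (an overall percent homology and a list of short perfect matches), here is a minimal sketch that computes both from a pair of already aligned sequences. It assumes that homology means the fraction of identical residues among aligned, non-gap columns, which may differ from the definition the authors used; the function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned (non-gap) columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def perfect_runs(aligned_a: str, aligned_b: str, min_len: int = 5, gap: str = "-"):
    """Return (start, end) column indices, 0-based and inclusive, of runs of
    consecutive identical residues that are at least min_len long."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    runs, start = [], None
    for i, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        same = a == b and a != gap
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(aligned_a) - start >= min_len:
        runs.append((start, len(aligned_a) - 1))
    return runs


# Toy alignment (hypothetical, not taken from Fig. 4).
seq_a = "ACDEFGHIK-LMN"
seq_b = "ACDEFGYIKWLMQ"
print(round(percent_identity(seq_a, seq_b), 1))
print(perfect_runs(seq_a, seq_b, min_len=5))
```

Column indices returned by `perfect_runs` refer to the alignment, so converting them to residue numbers in either protein requires subtracting the gaps that precede each run.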

Homology at the ATP-binding Site — Fig. 5 compares the
homologies centered around the HIGH region of tyrosyl-tRNA
synthetase from B. stearothermophilus, methionyl- and
isoleucyl-tRNA synthetases from E. coli, and the methionyl-
and valyl-tRNA synthetases from S. cerevisiae. The importance
of the HIGH region in ATP binding and catalysis has
become apparent from the studies of Fersht et al. (1984). This
region is in the NH2-terminal portion of the bacterial enzymes
mentioned above, as is the case for the majority of prokaryotic
tRNA synthetases, whereas we located the HIGH sequence
in both yeast methionyl- and valyl-tRNA synthetases
approximately 200 amino acid residues from the NH2 terminus.
That this region corresponds to the ATP-binding site in yeast
valyl-tRNA synthetase can be deduced by analogy with similar
positions of the folded α/β topology in yeast methionyl-tRNA
synthetase (Walter et al., 1983). Thus, the two yeast
enzymes bear an NH2-terminal chain extension with respect
to the mononucleotide binding fold. In yeast valyl-tRNA
synthetase, this NH2-terminal extension is mainly an α-helical
region as deduced from predicted secondary structures.

Homology at the CCA-binding Site of tRNA — Covalent labeling
of methionyl-tRNA synthetase from E. coli with 2',3'-dialdehyde
tRNAfMet has led to the identification of a peptide
encompassing Lys-335 (Hountondji and Blanquet, 1985). Although
the exact position of this lysine residue in the crystal
structure has not yet been located, it is part of the COOH-terminal
helical domain of the synthetase (see Brunie et al.




^ The sequence of the E. coli gene coding for valyl-tRNA synthetase
was sent to us before publication by Dr. R. Leberman (EMBL,
Grenoble, France) and co-workers. It turned out that the protein
sequence was 45% homologous to the yeast enzyme and 23% homologous
to the E. coli isoleucyl-tRNA synthetase.
[Fig. 4, panel B. Residue-by-residue alignment of yeast valyl-tRNA synthetase with E. coli isoleucyl-tRNA synthetase; the aligned sequences are not legible in this copy.]

Fig. 4. Homology between the amino acid sequences of
yeast valyl-tRNA synthetase and E. coli isoleucyl-tRNA synthetase.
The comparisons shown in both A and B use programs from
the University of Wisconsin Genetics Computer Group. The E. coli
sequence is from Webster et al. (1984). The comparison in A uses the
Dot Matrix program. Average score values were calculated for pairs
of 25-amino acid segments using the mutation matrix of Staden
(1982). If the average score value was equal to or greater than 25, a
dot was printed at the corresponding position of the matrix. In B, the
(a)  TyrRSbs    33  LYCGFDPTADSL  H I G H  LATI    52
(b)  MetRS       9  VTCALPYANGSI  H L G H  MLEH    28
(c)  MetRSsc   200  ITSALPYVNNVP  H L G N  IIGS
(d)  IleRS      53  LHDGPPYANGSI  H I G H  SVNK    72
(e)  ValRSsc   183  IPAPPPNVTGAL  H I G H  ALTI   204

Fig. 5. Alignment of the amino acid sequences from the
HIGH regions. The numbering indicates the distance from the NH2
terminus. The letters in parentheses indicate the reference of the
sequence: a, Winter et al., 1983; b, Barker and Winter, 1982; c, Walter
et al., 1983; d, Webster et al., 1984; and e, this work. TyrRSbs, B.
stearothermophilus tyrosyl-tRNA synthetase; MetRS, E. coli methionyl-tRNA
synthetase; MetRSsc, S. cerevisiae methionyl-tRNA synthetase;
IleRS, E. coli isoleucyl-tRNA synthetase; ValRSsc, S. cerevisiae
valyl-tRNA synthetase.



(a)  TyrRS     223  TVPLITKADGTKFGK-T  238
(b)  MetRS     329  NGAKMSKSRGT-FIKAS  344
(c)  MetRSsc   522  EHGKFSKSRGV        532
(d)  IleRS     599  QGRKMSKSIGKTVSPQD  615
(e)  ValRSsc   700  QGRKMSKSLGNVIDPLD  716



Fig. 6. Alignment of the amino acid sequences around the
KMSKS regions. The origins of the sequences are indicated by the
same nomenclature used in Fig. 5. The numbering indicates the
distance from the NH2 terminus. The references are as follows: a,
Barker et al., 1982a; b, Barker et al., 1982b; c, Walter et al., 1983; d,
Webster et al., 1984; and e, this work.
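
Locating the HIGH and KMSKS signatures of Figs. 5 and 6 in a protein sequence can be sketched with regular expressions. The two patterns below are assumptions chosen only to cover the variants named in the figures (for example H-L-G-N and K-F-S-K-S in the yeast methionyl enzyme); they are not consensus definitions from the paper, and the function name is illustrative.

```python
import re

# Illustrative patterns: they admit the motif variants visible in Figs. 5 and 6.
HIGH_LIKE = re.compile(r"H[IL]G[HN]")
KMSKS_LIKE = re.compile(r"K[MF]SKS")


def find_motifs(sequence, pattern, offset=1):
    """Return (position, matched text) pairs.

    offset is the residue number of the first letter of `sequence`,
    so the default of 1 gives ordinary 1-based positions.
    """
    return [(m.start() + offset, m.group()) for m in pattern.finditer(sequence.upper())]


# Fragment of E. coli methionyl-tRNA synthetase from Fig. 6 (gap removed),
# whose first residue is numbered 329 in the figure.
metrs_fragment = "NGAKMSKSRGTFIKAS"
print(find_motifs(metrs_fragment, KMSKS_LIKE, offset=329))

# Fragment of yeast methionyl-tRNA synthetase from Fig. 6, starting at 522.
metrs_sc_fragment = "EHGKFSKSRGV"
print(find_motifs(metrs_sc_fragment, KMSKS_LIKE, offset=522))
```

With the numbering of Fig. 6, the KMSKS match in the E. coli methionyl enzyme starts at residue 332, which places its second lysine at position 335, the residue identified by affinity labeling.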

cited in Hountondji and Blanquet, 1985). The functional
importance of the tRNA synthetase region corresponding to
Lys-335 is further supported by labeling of a similar sequence
in E. coli tyrosyl-tRNA synthetase using [14C]tRNATyr (Hountondji
et al., 1986a). The labeled lysines at positions 229, 234,
and 237 belong to a sequence which is highly conserved in B.
stearothermophilus tyrosyl-tRNA synthetase (Winter et al.,
1983), and their spatial positions were deduced by analogy
with the known three-dimensional structure of the homologous
B. stearothermophilus enzyme (Hountondji et al., 1986a).
These lysines are part of the COOH-terminal domain, in the
middle of the β-turn joining the last β-strand of the nucleotide
domain to the first helix of the helical domain (Bhat et al.,
1982), hence in close contact with the adenylate site. The
corresponding lysines in the B. stearothermophilus enzyme
are located at positions 225, 230, and 233. Bedouelle and
Winter (1986) demonstrated that mutations at Lys-151,
Arg-207, and Lys-208 also affect the binding of the 3'-end of
tRNA. These results are not conflicting since the residues lie
on the rim of the tyrosyl adenylate pocket (Bedouelle and
Winter, 1986). Fig. 6 indicates the alignment of the reactive
lysines characterized in methionyl- and tyrosyl-tRNA synthetases
from E. coli with similar regions of E. coli isoleucyl-tRNA
synthetase and the yeast valyl- and methionyl-tRNA
synthetases. A more complete overview of similar regions of
other aminoacyl-tRNA synthetases is presented by Hountondji
et al. (1986b). This comparison indicates the presence
of the relevant KMSKS sequence which probably represents
the consensus sequence of the binding region of the 3'-end of
tRNA. This sequence is also conserved in the primary structures
of the three homologous tryptophanyl-tRNA synthetases
of prokaryotic and eukaryotic origins (Myers and Tzagoloff,
1985). Fig. 6 emphasizes the fact that the KMSKS
region is conserved in the valyl/isoleucyl-tRNA synthetase
pair within the perfect match of 13 amino acid residues.

amino acid sequences of the two proteins were aligned by the Align
program. Breaks in the sequence are shown as dots, and identities
between amino acid residues are shown by vertical lines. The number
at the beginning of each line corresponds to the number of the residue
in the yeast protein sequence (upper line) and the E. coli protein
sequence (lower line).
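
The dot-matrix comparison described in the legend to Fig. 4 (panel A) can be sketched as follows. This sketch substitutes a simple identity score for the Staden (1982) mutation matrix, so its threshold is a fraction of matching residues rather than the score of 25 quoted in the legend; the function names, window size, and threshold used here are illustrative assumptions.

```python
def dot_matrix(seq_a, seq_b, window=25, threshold=0.3, score=None):
    """Return the set of (i, j) window start positions whose average
    per-residue score over `window` residues is at least `threshold`."""
    if score is None:
        # Stand-in for a mutation matrix: 1 for identical residues, else 0.
        def score(x, y):
            return 1.0 if x == y else 0.0

    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            total = sum(score(seq_a[i + k], seq_b[j + k]) for k in range(window))
            if total / window >= threshold:
                dots.add((i, j))
    return dots


def render(dots, n_rows, n_cols):
    """Draw the matrix as text, one row per window start in the first sequence."""
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for j in range(n_cols))
        for i in range(n_rows)
    )


# Hypothetical toy sequence compared with itself: only the diagonal is marked.
toy = "MKTAYIAK"
dots = dot_matrix(toy, toy, window=4, threshold=1.0)
print(render(dots, len(toy) - 3, len(toy) - 3))
```

Passing a different `score` function, for instance a lookup into a substitution matrix, would bring the sketch closer to the procedure in the legend; the loop structure stays the same.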

E. coli methionyl-tRNA synthetase is structurally similar
to the B. stearothermophilus tyrosyl-tRNA synthetase (Zelwer
et al., 1982; Bhat et al., 1982). They are biglobular enzymes
composed of an NH2-terminal α/β domain (approximately
200 residues) connected through a long loop to an α-helix-rich
COOH-terminal domain. The latter is responsible for
tRNA binding. This is seen in the tyrosyl-tRNA synthetase
by protein engineering (Bedouelle and Winter, 1986) and by
creating a deletion in the corresponding gene so as to remove
100 residues in the COOH-terminal region, yielding a truncated
enzyme able to activate the amino acid but unable to
carry out the aminoacylation step (Waye et al., 1983). Since the
CCA arm is close to the adenylate site, the geometry of the
tRNA molecule imposes interaction of the anticodon stem
with the COOH-terminal end of the enzyme at a distance of
75 Å from the 3'-end of tRNA. Protein engineering confirms
that two separated clusters of basic residues, Arg-368-Arg-371
and Arg-407-Arg-408-Lys-410-Lys-411, at the end of the polypeptide
chain of tyrosyl-tRNA synthetase from B. stearothermophilus
(Ala-419) interact with the anticodon stem. The
correlation between each catalytic function of the tRNA synthetase
and the existence of a distinct structural domain was
postulated earlier (Jasin et al., 1983) and can also be deduced
in the case of yeast valyl-tRNA synthetase from the presence
of the relevant amino acid sequences mentioned above.

Amino acid residue 200 would roughly define the beginning
of the nucleotide binding fold, and the KMSKS region at
residue 702 would locate the beginning of the α-helical COOH-terminal
domain. In that respect, we notice the presence of
an α-helical region in the COOH-terminal part of the enzyme
according to secondary structure prediction. Furthermore, the
cluster of lysines from residues 952 to 1054 may
represent potential contact points with the tRNAVal anticodon
stem.

We asked whether the homology between
isoleucyl- and valyl-tRNA synthetases is indicative of a functional
relationship (the isoleucyl-tRNA synthetase misactivates
valine) or of an evolutionary relationship between these
two molecules. Twenty percent sequence homology, as reported
here for valyl- and isoleucyl-tRNA synthetases, has
only been observed to date for those enzymes specific for the
same amino acid but isolated from different organisms, i.e.
the methionyl-tRNA synthetase pair from E. coli and yeast;
the homology is even larger for the threonyl/tryptophanyl-tRNA
synthetase pairs from E. coli and yeast and the tyrosyl-tRNA
synthetase pair from E. coli and B. stearothermophilus
(cited by Hountondji et al., 1986b). In contrast, no homology
has been identified between two distinct aminoacyl-tRNA
synthetases specific for a given amino acid, except for the
functional regions mentioned above. In particular, there is no
homology between yeast valyl-tRNA synthetase and E. coli
threonyl-tRNA synthetase which could have explained the
misactivation of the isosteric valine-threonine amino acids.
Rather, the homology between valyl- and isoleucyl-tRNA
synthetases reported in this work suggests an evolution from
a common ancestral gene.

Acknowledgments — We thank Professor Y. Boulanger and Dr. P.
the skillful technical assistance of G. Nussbaum.






REFERENCES 

Baldwin, A. N., and Berg, P. (1966) J. Biol. Chem. 241, 839-842
Barker, D. G., and Winter, G. (1982) FEBS Lett. 145, 191-193
Barker, D. G., Bruton, C. J., and Winter, G. (1982a) FEBS Lett. 150, 419-423
Barker, D. G., Ebel, J. P., Jakes, R., and Bruton, C. J. (1982b) Eur. J. Biochem. 127, 449-457
Bedouelle, H., and Winter, G. (1986) Nature 320, 371-373
Bhat, T. N., Blow, D. M., Brick, P., and Nyborg, J. (1982) J. Mol. Biol. 158, 699-709
Blow, D. M., Bhat, T. N., Metcalfe, A., Risler, J. L., Brunie, S., and Zelwer, C. (1983) J. Mol. Biol. 171, 571-576
Bradford, M. M. (1976) Anal. Biochem. 72, 248-254
Chevallier, M. R., Bloch, J. C., and Lacroute, F. (1980) Gene (Amst.) 11, 11-19
Fasiolo, F., Bonnet, J., and Lacroute, F. (1981) J. Biol. Chem. 256, 2324-2328
Fersht, A. R., and Kaethner, M. M. (1976) Biochemistry 15, 3342-3345
Fersht, A. R., Shi, J. P., Wilkinson, A. J., Blow, D. M., Carter, P., Waye, M. M. Y., and Winter, G. P. (1984) Angew. Chem. Int. Ed. Engl. 23, 467-473
Hanahan, D., and Meselson, M. (1983) Methods Enzymol. 100, 333-342
Hountondji, C., and Blanquet, S. (1985) Biochemistry 24, 47-52
Hountondji, C., Lederer, F., and Blanquet, S. (1986a) Biochemistry 24, 1175-1180
Hountondji, C., Dessen, Ph., and Blanquet, S. (1986b) Biochimie (Paris) 68, 1071-1078
Igloi, G. L., von der Haar, F., and Cramer, F. (1977) Biochemistry 16, 1696-1700
Jasin, M., Regan, L., and Schimmel, P. (1983) Nature 306, 441-447
Kern, D., Giege, R., Robbe-Saul, S., Boulanger, Y., and Ebel, J. P. (1975) Biochimie (Paris) 57, 1167-1176
Laemmli, U. K. (1970) Nature 227, 680-685
Leatherbarrow, R. J., Fersht, A. R., and Winter, G. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 7840-7844
Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Moras, D., Comarmond, M. B., Fischer, J., Weiss, R., Thierry, J. C., Ebel, J. P., and Giege, R. (1980) Nature 288, 669-674
Myers, A. M., and Tzagoloff, A. (1985) J. Biol. Chem. 260, 15371-15377
Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
Staden, R. (1982) Nucleic Acids Res. 10, 2951-2961
Vieira, J., and Messing, J. (1982) Gene (Amst.) 19, 259-268
Walter, P., Gangloff, J., Bonnet, J., Boulanger, Y., Ebel, J. P., and Fasiolo, F. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 2437-2441
Waye, M. M. Y., Winter, G., Wilkinson, A. J., and Fersht, A. R. (1983) EMBO J. 2, 1827-1829
Webster, T., Tsai, H., Kula, M., Mackie, G. A., and Schimmel, P. (1984) Science 226, 1315-1317
Winter, G., Fersht, A. R., Wilkinson, A. J., Zoller, M., and Smith, M. (1982) Nature 299, 756-758
Winter, G., Koch, G. L. E., Hartley, B. S., and Barker, D. G. (1983) Eur. J. Biochem. 132, 383-387
Young, R. A., and Davis, R. W. (1983a) Proc. Natl. Acad. Sci. U. S. A. 80, 1194-1198
Young, R. A., and Davis, R. W. (1983b) Science 222, 778-782
Zelwer, C., Risler, J. L., and Brunie, S. (1982) J. Mol. Biol. 155, 63-81



The Journal of Biological Chemistry

© 1988 by The American Society for Biochemistry and Molecular Biology, Inc.



Vol. 263, No. 2, Issue of January 15, pp. 868-877, 1988
Printed in U.S.A.



Valyl-tRNA Synthetase Gene of Escherichia coli K12

PRIMARY STRUCTURE AND HOMOLOGY WITHIN A FAMILY OF AMINOACYL-TRNA SYNTHETASES* 

(Received for publication, June 8, 1987) 

J. Denis Heck* and G. Wesley Hatfield§ 

From the Department of Microbiology and Molecular Genetics, California College of Medicine, University of California, Irvine,
Irvine, California 92717



The DNA nucleotide sequence of the valS gene encoding
valyl-tRNA synthetase of Escherichia coli has
been determined. The deduced primary structure of
valyl-tRNA synthetase was compared to the primary
sequences of the known aminoacyl-tRNA synthetases
of yeast and bacteria. Significant homology was detected
between valyl-tRNA synthetase of E. coli and
other known branched-chain aminoacyl-tRNA synthetases.
In pairwise comparisons the highest level of
homology was detected between the homologous valyl-tRNA
synthetases of yeast and E. coli, with an observed
41% direct identity overall. Comparisons between the
valyl- and isoleucyl-tRNA synthetases of E. coli
yielded the highest level of homology detected between
heterologous enzymes (19.2% direct identity overall).
An alignment is presented between the three branched-chain
aminoacyl-tRNA synthetases (valyl- and isoleucyl-tRNA
synthetases of E. coli and yeast mitochondrial
leucyl-tRNA synthetase) illustrating the close relatedness
of these enzymes. These results give credence
to the supposition that the branched-chain aminoacyl-tRNA
synthetases along with methionyl-tRNA synthetase
form a family of genes within the aminoacyl-tRNA
synthetases that evolved from a common ancestral progenitor
gene.



As a group the aminoacyl-tRNA synthetases of Escherichia
coli are responsible for performing the same essential cellular
function, the aminoacylation of tRNA. However, among the
individual members of this class of enzymes there exists a
high degree of diversity with regard to the overall sizes (1)
and quaternary structures (2). While some individual size
differences may possibly be ascribed to additional domains
which serve functions other than those immediately required
for the catalysis of tRNA aminoacylation, such as subunit-subunit
interaction (3), autoregulatory functions (4, 5), or
protein folding and tRNA conformation constraint domains
(6), the fact that only limited primary sequence homology is
observed in pairwise comparisons of the amino acid sequences
of these enzymes is seemingly inconsistent with synthetases
sharing a common evolutionary origin or relatedness. In the
vast majority of cases only pockets of limited primary sequence
homology, occurring predominantly within the amino-terminal
half of these enzymes, have been discerned in these
comparisons. With the singular exception of a comparison
between the primary structure of yeast valyl-tRNA synthetase
and isoleucyl-tRNA synthetase of E. coli, no extended regions
of primary sequence homologies between heterologous aminoacyl-tRNA
synthetases have been observed to date (7).

We report here the entire nucleotide sequence of the valS 
gene encoding valyl-tRNA synthetase along with the corre- 
sponding deduced amino acid sequence. Homology compari- 
sons between the deduced primary structure of valyl-tRNA 
synthetase and the primary structures of the other known 
aminoacyl-tRNA synthetases are described. Additional cor- 
roborative evidence of the substantial degree of relatedness 
which exists between the heterologous valyl-tRNA and isoleu- 
cyl-tRNA enzymes is presented in comparisons of these two 
enzymes from E. coli K12. Common sequence homologies with
other branched-chain aminoacyl-tRNA synthetases strongly
support the hypothesis that these enzymes evolved from a 
common progenitor gene. 

EXPERIMENTAL PROCEDURES

Materials — All restriction endonucleases and other enzymes were
purchased from Boehringer Mannheim Biochemicals or New England
Biolabs, Inc. All radioactive compounds were obtained from Amersham
Corp. All chemicals were either from Sigma or Mallinckrodt
Chemical Works.

Nucleotide Sequencing Analysis — DNA sequence analysis was performed
according to the dideoxy chain termination method of Sanger
et al. (8). The identity of each nucleotide of the noncoding strand was
verified by the independent determination of the complete DNA
sequence of both strands, with some portions of each strand repeatedly
analyzed from overlapping sequential deletions as illustrated in
Fig. 1. The method of Henikoff (9) was employed for the generation
of the M13 sequential deletion derivatives utilized in DNA sequence
analysis. Screening of the potential sequential deletion derivatives
prior to DNA sequence analysis was accomplished either by using
representative cloned derivatives as template DNAs in dideoxy-T
sequencing reactions (10), followed by electrophoresis in a buffer
gradient gel (11) and autoradiography to allow banding pattern comparisons
to be made, or by determining the relative sizes of the
purified single-stranded deletion DNAs directly by electrophoresis in
a 1.0% agarose gel buffered with 2 x Hellings (12).

DNA — M13 RF DNA (13) and cesium chloride band-purified plasmid
DNA were prepared by standard methods (14).

Computer Analysis of Nucleotide and Amino Acid Sequences —
Analyses of the determined nucleotide sequences were facilitated by
use of the DNA Inspector II program (15). Analyses of the deduced
amino acid sequence of valyl-tRNA synthetase and comparisons of
the primary structure of valyl-tRNA synthetase with the other known
aminoacyl-tRNA synthetase primary structures were accomplished
by use of the programs (GENED, SEQ, and PEP) from BIONET-Intelligenetics
(16) along with the programs (Codon Frequency, Best
Fig. 1. Partial restriction endonuclease map and DNA sequencing
strategy of the valS gene of E. coli K12. The thicker lined portion of the
restriction map represents DNA sequences of the naturally occurring ColE1
plasmid portion present in the Clarke-Carbon library plasmid pLC26-22 (18).
Recombinant M13mp10 and 11 or M13mp18 and 19 containing either discrete
restriction fragments or sequential derivatives (9) of DNA excised from
plasmid pLC26-22 were used as templates in DNA sequence determinations.
The extent of the arrows beneath the restriction map represents the breadth
of readable DNA sequence determined by the dideoxy chain termination
method (8). The large open arrow below the partial restriction endonuclease map
designates the DNA region encoding the valS structural gene.



Fit, Gap, and PepPlot) of the University of Wisconsin Genetics 
Computer Group (17). 

RESULTS 

Determination of the Nucleotide Sequence of the valS Gene
of E. coli — Starting with a hybrid plasmid of the Clarke-Carbon
E. coli library (18), the valS structural gene was
subcloned and molecular genetic elements responsible for valS
expression were characterized (19). Based on the physical
map of the valS gene, specific DNA restriction endonuclease
fragments were isolated and inserted into bacteriophage
M13mp10 or mp11. The nucleotide sequences of these valS
gene restriction endonuclease fragments were determined by
the dideoxy-chain termination method (8). Additionally,
larger-sized DNA restriction endonuclease fragments encompassing
the distal four-fifths of the valS structural gene (ranging
from 1.4 to 2.2 kb^ in size) were isolated and inserted into
bacteriophage M13mp18 or mp19. The replicative forms (RF)
of these M13 bacterial phages were utilized to generate a
series of sequential deletion derivatives spanning both strands
of the DNA encoding valyl-tRNA synthetase (9). The nucleotide
sequences of these valS gene M13 deletion derivatives
were also determined by the method of Sanger et al. (8). The
sequencing strategy employed illustrates that the nucleotide
sequence was independently obtained for both strands of the
DNA with much of the nucleotide sequence of each strand
repeatedly determined from analysis of overlapping sequential
DNA segments (Fig. 1). The 2856 nucleotide DNA sequence
of the sense strand of the valS gene is shown in Fig. 2.

^ The abbreviations used are: kb, kilobase; bp, base pair.
^ W.-C. Chu and J. Horowitz, personal communication.

Localization and Characterization of the valS Gene — The
deduced amino acid sequence of the valS gene is shown
immediately below the determined nucleotide sequence in Fig.
2. Determination of the purified valyl-tRNA synthetase protein
amino-terminal sequence^ has confirmed the proposed
translational start of valS (19). An open reading frame of
2856 nucleotides, extending from the amino-terminal methionine
codon, encodes a deduced polypeptide of 951 amino
acids in length. The calculated molecular weight, 108,070, is
in close approximation to the previously determined molecular
weight of 110,000 for valyl-tRNA synthetase (20). The
deduced amino acid composition of the valS structural gene
is in very close agreement with the amino acid composition
obtained from protein hydrolysis of purified valyl-tRNA synthetase
(the observed differences between the total percentages
of individual amino acids are < 1.1%).^ It should be pointed
out that while the deduced and hydrolyzed purified protein
amino acid compositions are in close agreement they both
differ markedly from the previously determined amino acid
composition for valyl-tRNA synthetase of E. coli (20). A
comparison of the codon frequency usage of valS with the
average frequency of codon usage obtained from analysis of
25 abundant E. coli genes is presented in Table I. The percentage
of codon usage for respective amino acids within valS
versus the average utilization observed in 25 abundant genes
shows the same general trends (21). Specifically, the frequency
of rare codon usage in valS closely mimics the average observed
in the other genes.
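
The per-amino-acid codon percentages that such a comparison rests on can be computed along the following lines. This is a minimal sketch, not the Codon Frequency program the authors used; the function names and the toy coding sequence are hypothetical.

```python
from collections import Counter

# The six codons that encode leucine in the standard genetic code.
LEUCINE_CODONS = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]


def codon_counts(coding_sequence: str) -> Counter:
    """Count codons in an in-frame coding sequence (spaces are ignored)."""
    seq = coding_sequence.upper().replace(" ", "")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


def synonymous_usage(counts: Counter, synonymous_codons) -> dict:
    """Percent usage of each codon among the codons listed for one amino acid."""
    total = sum(counts[c] for c in synonymous_codons)
    return {
        c: (100.0 * counts[c] / total if total else 0.0)
        for c in synonymous_codons
    }


# Hypothetical toy gene: ATG CTG CTG CTC TTA TAA
toy_gene = "ATGCTGCTGCTCTTATAA"
usage = synonymous_usage(codon_counts(toy_gene), LEUCINE_CODONS)
for codon, percent in usage.items():
    print(codon, percent)
```

Running the same two functions on a gene of interest and on a reference set of highly expressed genes gives two tables of percentages that can be compared codon by codon.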

Finally, in contrast to earlier reports (22), the deduced valS 
primary sequence does not contain any significant repeat 
units. The purported existence of these repeat units, thought 
to be the result of a gene duplication/fusion event, was used 
to partially explain the large differences observed in the 
molecular weights within the aminoacyl-tRNA synthetases. 
While there is the hint of an internal repeat element within 
the valS deduced primary structure (at amino acid residues 
328-343 and from 924 to 939, Fig. 2), the fact that there are 
no significant repeat units within this and other large ami- 
noacyl-tRNA synthetases strongly argues that these polypep- 
tides did not arise from a gene duplication/fusion event (23). 

Primary Structure Homology between Valyl-tRNA Synthetase
and Other Aminoacyl-tRNA Synthetases — Utilizing available
computer programs (cf. "Materials and Methods"), we
have compared the deduced primary structure of the valS gene
with the primary structures of the alaS, glnS, gltX, glyS, hisS,
ileS, metG, pheS, serS, thrS, trpS, and tyrS genes of E. coli
and the MSL1 and VAS1 genes of Saccharomyces cerevisiae,
which encode yeast mitochondrial leucyl-tRNA synthetase
and cytoplasmic valyl-tRNA synthetase, respectively (Refs.
4, 25-36, and 7, respectively). As expected, the strongest
overall homology is detected when comparing the deduced
primary structures of the unrelated homologous valyl-tRNA
synthetase enzymes. Based on the sequence alignment shown
in Fig. 3, there is a 41% overall direct amino acid correspondence
between the two deduced primary sequences of valyl-tRNA
synthetase obtained from yeast and bacteria. When the
percent direct amino acid identity is calculated only for the
amino proximal two-thirds of the two primary sequences the
identity level rises to approximately 48.3%, reflecting the fact
that the more strongly conserved regions are found toward
the respective amino termini, the carboxyl-terminal portions
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ATG CAA AAC ACA TAT AAC CCA CAA GAT ATC GAA CAC CCG CTT TAC GAG CAC TGG CAA AAG CAC GCC TAC TTT AAG CCT AAT GGC GAT GAA 
Fig. 2. The nucleotide sequence and deduced amino acid sequence of the nontranscribed DNA strand
of the valS gene of E. coli K12. The complete nucleotide sequence of the valS gene, beginning with the
translational start codon (ATG) at nucleotide position +93, is listed. The nucleotide numbering is relative to the
start of transcription (19). The predicted amino acid sequence of valyl-tRNA synthetase, the valS gene product, is
listed immediately below the nucleotide sequence with the residues numbered from the start of translation. The
sequence of the first deduced 10 amino acids agrees with the determined amino-terminal sequence of the enzyme.
Table I

Codon frequency usage of the valS gene as compared with codon usage in 25 E. coli genes
E. coli codon usage compilation from Konigsberg and Godson (21).

Residue      Codon frequencies(a)     Residue      Codon frequencies(a)
and codon    valS       E. coli       and codon    valS       E. coli

Phe UUU      6 (17)     104 (44)      Tyr UAU      13 (50)    69 (41)
Phe UUC      29 (83)    135 (56)      Tyr UAC      13 (50)    101 (59)
Leu UUA      0 (0)      36 (6)        Ter UAA      1          22
Leu UUG      2 (3)      51 (8)        Ter UAG      0          1
Leu CUU      4 (5)      54 (9)        Ter UGA      0          2
Leu CUC      6 (8)      41 (7)
Leu CUA      0 (0)      11 (2)        His CAU      3 (18)     42 (39)
Leu CUG      67 (85)    432 (69)      His CAC      14 (82)    66 (61)
Ile AUU      19 (30)    151 (37)      Gln CAA      3 (8)      75 (27)
Ile AUC      44 (70)    252 (62)      Gln CAG      33 (92)    207 (73)
Ile AUA      0 (0)      2 (1)
                                      Asn AAU      7 (18)     57 (24)
Met AUG      32         189           Asn AAC      33 (83)    179 (76)
Val GUU      25 (41)    182 (38)      Lys AAA      38 (72)    296 (77)
Val GUC      12 (20)    62 (13)       Lys AAG      15 (28)    90 (23)
Val GUA      10 (16)    111 (23)
Val GUG      14 (23)    130 (27)      Asp GAU      25 (38)    175 (51)
                                      Asp GAC      40 (62)    168 (49)
Ser UCU      10 (31)    86 (27)
Ser UCC      7 (22)     83 (26)       Glu GAA      63 (79)    328 (73)
Ser UCA      0 (0)      27 (8)        Glu GAG      17 (21)    119 (27)
Ser UCG      4 (13)     37 (11)
Ser AGU      3 (9)      21 (6)        Cys UGU      3 (38)     21 (42)
Ser AGC      8 (25)     70 (22)       Cys UGC      5 (63)     29 (58)
Pro CCU      4 (10)     24 (9)        Trp UGG      24         48
Pro CCC      0 (0)      16 (6)
Pro CCA      6 (14)     53 (20)       Arg CGU      37 (61)    201 (58)
Pro CCG      32 (76)    174 (65)      Arg CGC      23 (38)    121 (36)
                                      Arg CGA      1 (2)      8 (2)
Thr ACU      8 (15)     76 (24)       Arg CGG      0 (0)      11 (3)
Thr ACC      40 (76)    162 (51)      Arg AGA      0 (0)      4 (1)
Thr ACA      2 (4)      19 (6)        Arg AGG      0 (0)      1 (0.25)
Thr ACG      3 (6)      63 (20)
                                      Gly GGU      27 (41)    231 (48)
Ala GCU      8 (10)     202 (28)      Gly GGC      38 (58)    197 (41)
Ala GCC      20 (26)    136 (19)      Gly GGA      0 (0)      22 (5)
Ala GCA      14 (18)    166 (23)      Gly GGG      1 (2)      33 (7)
Ala GCG      36 (46)    221 (30)

Total        951*       6,478*

(a) Numbers represent times codon used in genes. Numbers in parentheses represent the percentage of codon
usage for the respective amino acid within valS or the 25 grouped E. coli genes.
* Total codons minus translational stop codon(s).

Fig. 3. Primary sequence homology alignment of valyl-tRNA synthetase of E. coli and the yeast
cytoplasmic valyl-tRNA synthetase of S. cerevisiae. The deduced primary sequences of the valS gene of E.
coli and the VAS1 gene of S. cerevisiae, both encoding valyl-tRNA synthetase, are aligned (7). Identical amino
acids are indicated by a filled circle located immediately above identical residues within the two aligned sequences.
The numbers to the left of each line give the residue position number of the first amino acid listed relative to the
start of each respective primary sequence. With regard to previously identified or proposed functionally equivalent
catalytic and/or binding residues, there is a 12/14 match found at the consensus HIGH region (E. coli sequence
positions 40-53), a 16/17 match is found at the DWCISRQL consensus sequence (E. coli sequence positions
420-436) and a 14/16 match at the proposed KMSK consensus site thought responsible (40) for binding the 3'-end of
the tRNA (E. coli sequence positions 552-567). An additional region of substantial homology has a 22/24 direct
correspondence (E. coli sequence positions 468-491). The carboxyl terminus of the E. coli primary sequence is
indicated by an asterisk.

apparently having diverged more throughout evolution. It is
of interest to note that the yeast valyl-tRNA synthetase
enzyme has additional sequence elements located at both
termini that apparently are not found within the equivalent
bacterial enzyme, approximately 140 residues at the amino
terminus and 40 residues at the carboxyl terminus (Fig. 3).
Whether these additional sequences in yeast are due
merely to species differentiation or might serve some
additional functional role is conjecture at this time.

Fig. 4. Primary sequence homology alignment of the valyl-, isoleucyl-, and methionyl-tRNA
synthetases of E. coli and the yeast mitochondrial leucyl-tRNA synthetase of S. cerevisiae. The deduced
primary sequence of the valS gene encoding valyl-tRNA synthetase (ValRS) is aligned with the primary
sequences of isoleucyl- (IleRS) and methionyl-tRNA synthetases (MetRS) of E. coli along with the primary
sequence of the yeast mitochondrial leucyl-tRNA synthetase (LeuRS) (Refs. 29, 30 and 36, respectively).
Identical amino acid residues are boxed if two or more sequences have a common residue at the same
alignment position. Twelve regions of substantial homology which exist between valyl-tRNA synthetase and
isoleucyl-tRNA synthetase are indicated by stippling in both sequences. The carboxyl terminus of each
respective primary sequence is indicated by an asterisk.

As previously mentioned, the strongest overall primary
sequence homology observed in pairwise comparisons between
heterologous aminoacyl-tRNA synthetases is detected when
comparing the valyl-tRNA synthetase and isoleucyl-tRNA
synthetase primary sequences of E. coli. Sequence homology
comparisons between the valyl- and isoleucyl-tRNA synthetase
enzymes, utilizing the depicted optimum valyl-, isoleucyl-,
methionyl-tRNA synthetase/yeast mitochondrial leucyl-tRNA
synthetase alignment (Fig. 4), show an overall 19.2%
direct amino acid identity per unit length and a 41.0%
"chemical equivalent" amino acid identity per unit length. While
gapping was introduced to allow for insertions or deletions
present within the four individual synthetases, these values
are in close agreement with values obtained from the optimal
alignment found when only the valyl-tRNA synthetase and
isoleucyl-tRNA synthetase enzymes were compared (20%
direct identity and 43.2% similar identity; alignment not
shown). Both the comprehensive alignment of the four
branched-chain aminoacyl-tRNA synthetases and the pairwise
comparison between valyl- and isoleucyl-tRNA synthetase
were substantially based on the alignment of 12 short
blocks of relatively strong homology (>80% at the level of
"chemically equivalent" amino acids; regions stippled in Fig.
4) that exist primarily between the closely related valyl- and
isoleucyl-tRNA synthetase enzymes but are in many cases
common to both, or one of, the remaining two synthetases,
methionyl-tRNA synthetase and yeast mitochondrial leucyl-tRNA
synthetase. The preservation of all, or a subset thereof,
of these 12 conserved regions within the valyl-, methionyl-,
isoleucyl-, and leucyl-tRNA synthetases strongly suggests
that these functionally related enzymes may also form an
evolutionarily related family within the aminoacyl-tRNA
synthetases by virtue of having diverged from a common ancestral
gene. The fact that these 12 homologous pockets still remain
within this group of most ancient of proteins suggests that
these regions possibly represent functionally related blocks
within this family of aminoacyl-tRNA synthetases.
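
The "direct" and "chemically equivalent" identity figures quoted above can be illustrated with a short sketch. The paper does not list which residues it treats as chemically equivalent, nor the exact denominator behind "per unit length", so the grouping table and the use of the full alignment length below are assumptions made only for illustration; the two toy fragments are likewise hypothetical.

```python
# Hypothetical grouping of "chemically equivalent" residues; the paper does
# not list its groups, so this table is an assumption for illustration only.
EQUIVALENT_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG", "C", "P"]
GROUP_OF = {aa: i for i, group in enumerate(EQUIVALENT_GROUPS) for aa in group}


def alignment_identity(seq_a, seq_b, gap="-"):
    """Return (percent direct identity, percent chemically equivalent identity).

    Both inputs are rows of one gapped alignment; percentages are taken over
    the full alignment length (an assumed reading of "per unit length").
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("alignment is empty")
    identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # gapped columns count toward length but never match
        if a == b:
            identical += 1
            similar += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    length = len(seq_a)
    return 100.0 * identical / length, 100.0 * similar / length


direct, equivalent = alignment_identity("HMGHAF-QQT", "HIGHALTIAI")
print(direct, equivalent)  # 40.0 50.0
```

In this toy pair, the Met/Ile column counts as chemically equivalent but not identical, which is why the second figure exceeds the first.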

Significantly, these 12 sequentially ordered regions of
substantial homology are spread out along the entire lengths of
both valyl- and isoleucyl-tRNA synthetases (Fig. 4). This
finding contrasts sharply with the results of all but one (7) of
a number of previous alignment studies, where the limited
amount of primary sequence homology observed in pairwise
comparisons between heterologous aminoacyl-tRNA synthetases,
if observed, was found to occur predominantly within
the amino proximal halves of these enzymes where the
catalytic core domain is believed to reside (40).

Fig. 5. Schematic representation of the primary sequence alignment between valyl-, isoleucyl-, and
methionyl-tRNA synthetases of E. coli and mitochondrial leucyl-tRNA synthetase of S. cerevisiae as
depicted in Fig. 4. A schematic representation is presented to illustrate homologous regions and gapping that
was introduced to achieve the alignment between valyl-, isoleucyl-, and methionyl-tRNA synthetases of E. coli and
mitochondrial leucyl-tRNA synthetase of S. cerevisiae presented in Fig. 4. The rectangles along each line represent
scaled contiguous sequence elements that exist along the primary structure of each of the four aligned proteins.
The connecting line between the rectangles indicates the extent of the gaps introduced to bring homologous regions
into alignment. Indicated by either specific geometric patterns (thick horizontal bars, HIGH; slanted hatched lines,
DWCISRQ; and thin horizontal lines, KMSK) or solid black within these rectangles are positions along each
sequence that share sequence homology with the 12 substantial homologous regions which exist between valyl-
and isoleucyl-tRNA synthetase (Fig. 4). The thick bars highlight regions missing in the sequence immediately
above. Scale below gives overall position of the homologous blocks relative to the alignment of Fig. 4. The overall
lengths of the individual synthetases are given immediately above and on the right.

As previously mentioned, a subset of the same 12 regions
found common to both valyl-tRNA synthetase and isoleucyl-tRNA
synthetase are also regions of substantial shared
homology that are observed in a pairwise comparison between
the deduced primary structures of valyl-tRNA synthetase and
methionyl-tRNA synthetase of E. coli. However, in this case
the six regions of substantial homology occur primarily within
the amino-terminal half of the 677-amino acid long
methionyl-tRNA synthetase. Several of the 12 sequentially ordered
blocks found common to both valyl-tRNA synthetase and
isoleucyl-tRNA synthetase do not appear to have substantially
homologous regions within the methionyl-tRNA synthetase
primary sequence (i.e. regions 2 and 4, valyl-tRNA
synthetase residue numbers 180-189 and 272-282; Fig. 4). As
illustrated schematically in Fig. 6, the latter of these two
mentioned regions falls within an area common to the primary
sequences of the other three synthetases but not present
within the methionyl-tRNA synthetase primary sequence.
These approximately 220-residue long inserts common to
valyl-, isoleucyl-, and leucyl-tRNA synthetase appear to be
the result of an extended connective polypeptide region not
present within the region of the methionyl-tRNA synthetase
primary sequence that forms the mononucleotide-binding fold
believed to be involved in binding methionyl adenylate (41).
The methionyl-tRNA synthetase primary sequence additionally
lacks a smaller region of approximately 100 residues
located close to the block (DWCISRQ) common to all four
synthetases that immediately follows the previously defined
missing region (Fig. 5). The overall sequence homology
observed between valyl-tRNA synthetase and methionyl-tRNA
synthetase, utilizing the alignment indicated in Fig. 4, is
20.2% similar and 9.5% direct identity.

In comparisons between the unrelated heterologous yeast
mitochondrial (S. cerevisiae) leucyl-tRNA synthetase (39) and
valyl-tRNA synthetase of E. coli, it is observed that a number
of the previously defined blocks of shared homology, which
exist between valyl-tRNA synthetase and isoleucyl-tRNA
synthetase, are also common to the yeast leucyl-tRNA
synthetase (Fig. 4). Furthermore, the yeast leucyl-tRNA synthetase
enzyme possesses regions that are homologous to the extended
connective polypeptide region present in valyl-tRNA
synthetase and isoleucyl-tRNA synthetase enzymes but not found,
as previously mentioned, within the methionyl-tRNA
synthetase enzyme. However, as illustrated in Fig. 5, there is a span
of approximately 170 residues in length that is lacking in the
primary sequence of the yeast sequence but present to greater
or lesser extents within the bacterial primary sequences. The
values obtained in pairwise comparisons between the primary
structures of valyl- and leucyl-tRNA synthetase, based on the
alignment of Fig. 4, are 27.1% similar and 14.6% direct
identity.



Taken together, the observed degree of relatedness between
these four enzymes lends credence to the supposition initially
advanced by Wetzel (39) that valyl-, isoleucyl-, leucyl-, and
methionyl-tRNA synthetase are all members of a family
within the aminoacyl-tRNA synthetases. Based on the
sequence homologies that exist within this proposed family the
following evolutionary linkages are consistent:
methionyl-tRNA synthetase → leucyl-tRNA synthetase →
isoleucyl-tRNA synthetase and valyl-tRNA synthetase (Fig. 7).
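
The ordering of these linkages can be checked against the pairwise values quoted in the text for valyl-tRNA synthetase. The sketch below simply ranks the three partner enzymes by percent direct identity; the dictionary and function names are hypothetical, and ranking by a single pairwise value is a simplification, not the tree-building procedure behind Fig. 7.

```python
# Pairwise values against ValRS quoted in the text (Fig. 4 alignment):
# partner -> (percent direct identity, percent "chemically equivalent" identity)
PAIRWISE_TO_VALRS = {
    "IleRS": (19.2, 41.0),
    "LeuRS": (14.6, 27.1),
    "MetRS": (9.5, 20.2),
}


def rank_by_identity(pairwise):
    """Order partner enzymes from most to least similar by direct identity."""
    return sorted(pairwise, key=lambda name: pairwise[name][0], reverse=True)


ranking = rank_by_identity(PAIRWISE_TO_VALRS)
print(ranking)  # ['IleRS', 'LeuRS', 'MetRS']
```

The ranking places isoleucyl-tRNA synthetase closest to valyl-tRNA synthetase and methionyl-tRNA synthetase farthest, in line with the linkage order given above.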

Fig. 6. Amino acid sequence homology between valyl-tRNA
synthetase (ValRS) and other known aminoacyl-tRNA
synthetase primary sequence regions. The corresponding aligned
primary sequence regions show relatively strong identical and
chemically equivalent amino acid homology with valyl-tRNA synthetase
when allowing for minimal gapping. The number to the left of each
sequence refers to the first listed residue position relative to the start
of translation for each respective synthetase (Refs. listed in text).
Dashes indicate gaps in sequence to maximize alignments. All
synthetase sequences are E. coli derived unless otherwise noted.
Abbreviations: GlnRS, glutaminyl-; GluRS, glutamyl-; IleRS, isoleucyl-;
LeuRS, leucyl-; MetRS, methionyl-; SerRS, seryl-; TrpRS,
tryptophanyl-; and TyrRS, tyrosyl-; (Bs, B. stearothermophilus; ym, yeast
mitochondrial, or yc, yeast cytoplasmic (S. cerevisiae)).

With the exceptions of the previously mentioned isoleucyl-
and methionyl-tRNA synthetase of E. coli and the leucyl- and
valyl-tRNA synthetase of yeast, comparisons between the
primary structure of bacterial valyl-tRNA synthetase and the
primary sequences encoded by the other listed genes (results
not shown) detected only limited homology of a much lesser
degree than was detected in comparisons with methionyl-tRNA
synthetase. While no extensive segments of overall
homology exist, there are several short blocks of equivalent
homology that valyl-tRNA synthetase has in common with a
number of the other synthetases. The more substantially
shared regions of chemical equivalent amino acid homology
which exist between these heterologous enzymes are
illustrated in Fig. 6. The sequence homologous to the consensus
or identity sequence HIGH, which is located near the
amino-terminal end in all the aminoacyl-tRNA synthetases
possessing this region and believed to play a role in the binding
of ATP or the adenyl part of the adenylate intermediate (43),
is also found in the valyl-tRNA synthetase primary sequence
(Fig. 5). As illustrated in Fig. 6, the valyl-tRNA synthetase
primary sequence differs in the most commonly variable
residue of the consensus sequence with the chemically conserved
substitution of a methionine residue for an isoleucine residue
of the HIGH consensus sequence. The valyl-tRNA synthetase
primary sequence also possesses a region of homology shared
with many other synthetases which is believed to be involved
with binding the 3'-end of the tRNA molecule (44). As
illustrated in Figs. 4 and 5, this consensus sequence KMSKS is
found in all four members of the proposed family, even though
the enzymes are of prokaryotic and eukaryotic origins.
Another region of previously unreported homology, which is
common to all except isoleucyl-tRNA synthetase of the
proposed family along with several other aminoacyl-tRNA
synthetases, is also indicated in Fig. 4. Finally, there appears to
be a region of sequence homology that is common only to the
proposed family of branched-chain aminoacyl-tRNA
synthetases. As illustrated in Fig. 4, this consensus sequence
DWCISRQ, which in the case of isoleucyl-tRNA synthetase
has been shown to possess an N-ethylmaleimide reactive
cysteine that preferentially inactivates the isoleucyl-tRNA
synthetase enzyme, is exactly identical to the homologous
region present within the valyl-tRNA synthetase enzyme
that possesses a 9 out of 11 residues direct identity with the
isoleucyl-tRNA synthetase enzyme sequence (32).
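
The three consensus motifs named in the text (HIGH, KMSKS, and DWCISRQ) can be located in a protein sequence with a simple pattern search. The following is a minimal sketch only: the toy sequence is invented for illustration and is not the valS product, and allowing Ile, Met, or Leu at the second HIGH position is an assumption prompted by the Met-for-Ile substitution described above.

```python
import re

# Consensus motifs named in the text. The alternation at the second HIGH
# position is an illustrative assumption, not a rule stated in the paper.
MOTIFS = {
    "HIGH": re.compile(r"H[IML]GH"),
    "KMSKS": re.compile(r"KMSKS"),
    "DWCISRQ": re.compile(r"DWCISRQ"),
}


def find_motifs(sequence):
    """Return {motif name: [1-based start positions]} for each motif."""
    return {
        name: [m.start() + 1 for m in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence containing one copy of each motif.
toy = "PPNVTGSLHMGHAFQQTIMDQDWCISRQLWWGHRKMSKSKGNV"
print(find_motifs(toy))
# {'HIGH': [9], 'KMSKS': [35], 'DWCISRQ': [22]}
```

Positions are reported 1-based so that they read the same way as the residue numbers used in the figure legends.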



DISCUSSION 

In general, previous pairwise comparisons directed at un- 
covering possible sequence similarities between unrelated het- 
erologous aminoacyl-tRNA synthetases, in the hope of defin- 
ing domains involved in the binding of substrates and/or 
catalysis, have not revealed any extended regions of similarity; 
however, several synthetase pairs showed a number of short 
regions (6-14 residues) of statistically significant similarities 
(41). This is not surprising considering the fact that these 
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Fig. 7. Depiction of the evolutionary relatedness occurring 
between the valyl-, isoleucyl-, and methionyl-tRNA synthe- 
tases of coli and yeast mitochondrial leucyl-tRNA synthe- 
tase. Based on the percent sequence homology values obtained from 
analyses of the alignment depicted in Fig, 4, the degree of "evolution- 
ary relatedness** occurring between the four compared aminoacyl- 
tRNA synthetases is presented. The drawing depicts an **unrooted" 
tree (cladogram) with branch lengths drawn proportionally on the 
vertical axis to depict the evolution of these related sequences, isoleu- 
cyl-tRNA synthetase {lieRS) and valyl- {ValRS\ leucyl- (LeuRS), 
and methionyl-tRNA synthetases {MetRS) (46). 

enzymes are representative of some of the most ancient of all 
proteins, which implies that these enzymes have experienced 
extensive multiple evolutionary replacements. Imposed on top 
of this background, the existence of additional domains that 
have been implicated in functions other than catalysis, such 
as subunit interaction or specific regulatory functions, means 
that only a portion of some of the primary structures of these 
enzymes can be reasonably expected to exhibit extensive 
homology (37). With these two caveats in mind it was some- 
what unexpected to find the degree of chemically equivalent 
homology that exists between the branched-chain aminoacyl- 
tRNA synthetases. The observed 19,2% direct identity per 
unit length which exists between E. coli valyl- and isoleucyl- 
tRNA synthetases initially appears to be only moderately 
significant. However, when the length of the two primary 
structures are taken into account the authenticity of the 
relationship between valyl-tRNA synthetase and isoleucyl- 
tRNA synthetase is highly significant (42). In fact, when the 
average alignment score was computed from sets of scrambled 
sequences (whose compositions and lengths were both iden- 
tical to valyl-tRNA synthetase and isoleucyl-tRNA synthe- 
tase and subjected to the same alignment and scoring proce- 
dure used for the authentic valyl/isoieucyl-tRNA synthetase 
alignment), it was found that the alignment score obtained 
for the genuine sequences was more than 9.0 standard devia- 
tions above the scrambled comparison average.^ Moreover, 
when chemically equivalent amino acids are scored with the 
alignment shown in Fig. 4 there is a 41.0% direct correspond- 
ence between the primary sequences of valyl- and isoleucyl- 
tRNA synthetase. Clearly, the observed similarity between 
these two aminoacyl-tRNA synthetases is not due to chance 
but rather represents a genuine common ancestry. Addition- 
ally, the degree of relatedness is quite high among all individ- 
ual pairings of the four synthetases (see Fig. 7) that comprise 
this proposed evolutionarily related family of synthetases 
(39). The fact that many of the ordered regions of substantial 
homology depicted in Fig. 4 are common to all four branched- 
chain aminoacyl-tRNA synthetases (Fig. 5) indicates these 
synthetases are more closely related than other known syn- 
thetase groupings and that these remaining pockets of se- 
quence similarity possibly represent functionally important 
segments contributing to the function of these heterologous 
enzymes. 
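
The scrambled-sequence comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the scoring scheme (match 2, mismatch -1, linear gap -2), the number of shuffles, and all function names are hypothetical and are not the procedure or parameters used by the authors.

```python
import random
import statistics


def global_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (score only, linear gaps).

    The scoring values are illustrative assumptions, not the authors' scheme.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffled(seq, rng):
    """Return a scrambled copy with identical composition and length."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def scramble_z_score(a, b, n_shuffles=100, seed=0):
    """Standard deviations by which the real score exceeds the scrambled mean."""
    rng = random.Random(seed)
    real = global_alignment_score(a, b)
    null = [
        global_alignment_score(shuffled(a, rng), shuffled(b, rng))
        for _ in range(n_shuffles)
    ]
    return (real - statistics.mean(null)) / statistics.stdev(null)
```

Calling `scramble_z_score(seq1, seq2)` on two protein sequences returns the number of standard deviations separating the genuine alignment score from the scrambled average, which is the kind of statistic quoted in the text.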

As previously mentioned and illustrated in the schematic 
of Fig. 5, the valyl-, isoleucyl-, and leucyl-tRNA synthetase
enzymes have an additional peptide loop separating domains
that are equivalent to regions of the methionyl-tRNA synthetase
enzyme believed to be involved in the formation of the
adenylate-binding fold (38). It would be of interest to see if
the presence of these extended connective polypeptide regions is
required for formation of the adenylate or the subsequent
aminoacylation of tRNA. Isolation of genetically engineered
deletion mutants spanning this region should provide information
about the structural function of this extended connective
polypeptide domain.

It should be noted that while short regions of significant 
homology exist within the carboxyl thirds of these enzymes, 
specifically between valyl- and isoleucyl-tRNA synthetases 
(Fig. 3), the fact remains that more extensively shared ho- 
mology is present within the amino proximal thirds of these 
enzymes. This is taken to mean that within these amino-proximal
regions, which are quite homologous to the methionyl-tRNA
synthetase and thus comparable to the Bacillus
stearothermophilus tyrosyl-tRNA synthetase by x-ray diffraction
studies (45), the tertiary structures of these four
synthetases should be quite homologous.

A systematic search for amino acid sequences that poten- 
tially could form metal-binding domains in nucleic acid-bind- 
ing proteins has identified such proposed sequences in several 
of the aminoacyl-tRNA synthetases; specifically, both methionyl-
and isoleucyl-tRNA synthetases of E. coli have been so
identified (43). These sequences, of the form
Cys-X₂-Cys-X₉₋₁₆-Cys-X₂-Cys, are thought to bind the one Zn²⁺ ion which
is found per polypeptide chain in both methionyl- and
isoleucyl-tRNA synthetase proteins of E. coli (28, 29). A search of
the deduced valyl-tRNA synthetase primary structure for 
sequences with 4 Cys or His residues arranged in a manner 
suggestive of a metal-binding domain found no corresponding 
sequences present within valyl-tRNA synthetase. This finding 
is in contrast to reports that all three thermostable valyl-,
isoleucyl-, and methionyl-tRNA synthetases of Thermus
thermophilus HB8 bind Zn²⁺ ions (44). The proposed metal-binding
domains of both isoleucyl- and methionyl-tRNA synthetase
are located in distinctly different regions of their
respective sequences. The proposed domain of the isoleucyl-tRNA
synthetase enzyme is located proximal to the carboxyl
terminus of the enzyme (isoleucyl-tRNA synthetase, residues
902-925), while the proposed methionyl-tRNA synthetase
metal-binding domain is located within the amino-terminal
third of the enzyme (methionyl-tRNA synthetase, residues
145-161). Therefore, it seems that, if indeed functional, the
proposed metal-binding domains found within these two enzymes
are representative of a more recent evolutionary acquisition
since 1) these sequences are not found within the
primary sequences of other closely related family members
(i.e., E. coli valyl-tRNA synthetase and yeast mitochondrial
leucyl-tRNA synthetase and cytoplasmic valyl-tRNA synthetase)
and, more importantly, 2) these proposed sequences are
positioned in an order different than other ordered regions of
shared sequence homology. This suggests that these proposed
metal-binding domains were not present in the ancestral
progenitor gene responsible for the methionyl-, leucyl-, valyl-,
and isoleucyl-tRNA synthetase family.
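
A search of a primary structure for four Cys or His residues in a metal-binding arrangement, as described above, might be sketched with a regular expression. The spacings used here (2 residues, then 9 to 16, then 2) are an assumption, chosen so that a match spans 17 to 24 residues like the residue ranges quoted in the text; the function name and the toy sequence in the usage note are hypothetical.

```python
import re

# Assumed pattern: [Cys/His]-X2-[Cys/His]-X9..16-[Cys/His]-X2-[Cys/His].
# The lookahead lets overlapping candidates be reported as well.
MOTIF = re.compile(r"(?=([CH].{2}[CH].{9,16}[CH].{2}[CH]))")


def find_candidate_metal_sites(protein):
    """Return 1-based (start, end) residue ranges of candidate sites."""
    return [
        (m.start() + 1, m.start() + len(m.group(1)))
        for m in MOTIF.finditer(protein)
    ]
```

For example, `find_candidate_metal_sites("AAACGGCAAAAAAAAACGGCAAA")` reports one candidate spanning residues 4 to 20, while a sequence without suitably spaced Cys or His residues yields an empty list.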

ABSTRACT Universal trees based on sequences of single
gene homologs cannot be rooted. Iwabe et al. [Iwabe, N.,
Kuma, K.-I., Hasegawa, M., Osawa, S. & Miyata, T. (1989)
Proc. Natl. Acad. Sci. USA 86, 9355-9359] circumvented this
problem by using ancient gene duplications that predated the 
last common ancestor of all living things. Their separate, 
reciprocally rooted gene trees for elongation factors and 
ATPase subunits showed Bacteria (eubacteria) as branching
first from the universal tree with Archaea (archaebacteria) 
and Eucarya (eukaryotes) as sister groups. Given its topical 
importance to evolutionary biology and concerns about the 
appropriateness of the ATPase data set, an evaluation of the 
universal tree root using other ancient gene duplications is 
essential. In this study, we derive a rooting for the universal 
tree using aminoacyl-tRNA synthetase genes, an extensive 
multigene family whose divergence likely preceded that of 
prokaryotes and eukaryotes. An approximately 1600-bp con- 
served region was sequenced from the isoleucyl-tRNA syn- 
thetases of several species representing deep evolutionary 
branches of eukaryotes (Nosema locustae), Bacteria (Aquifex
pyrophilus and Thermotoga maritima) and Archaea (Pyrococcus
furiosus and Sulfolobus acidocaldarius). In addition, a new 
valyl-tRNA synthetase was characterized from the protist 
Trichomonas vaginalis. Different phylogenetic methods were 
used to generate trees of isoleucyl-tRNA synthetases rooted by 
valyl- and leucyl-tRNA synthetases. All isoleucyl-tRNA syn- 
thetase trees showed Archaea and Eucarya as sister groups, 
providing strong confirmation for the universal tree rooting 
reported by Iwabe et al. As well, there was strong support for 
the monophyly (sensu Hennig) of Archaea. The valyl-tRNA
synthetase gene from Tr. vaginalis clustered with other eukaryotic
ValRS genes, which may have been transferred from
the mitochondrial genome to the nuclear genome, suggesting 
that this amitochondrial trichomonad once harbored an en- 
dosymbiotic bacterium. 



Studies of early cellular evolution have been greatly influenced 
by two major findings of molecular systematics. First was the 
revelation from phylogenetic analyses of rRNA molecules that 
the universal tree of life consists of three domains: the Archaea 
(archaebacteria), Bacteria (eubacteria), and Eucarya
(eukaryotes) (1, 2). Second was the reciprocal rooting of gene
trees for two separate paralogous gene families, namely the genes
encoding elongation factors (EFs) Tu/1α and G/2 (3) and the
ATPase α and β subunits (3, 4), which showed that Archaea
and Eucarya are sister groups.

Despite the recent expansive growth of gene data bases, no 
other paralogous gene phylogenies have been developed that 
might allow us to confirm the root of the universal tree. The 
phylogenies of several other macromolecules, including RNA 
polymerases (5) and many ribosomal proteins (6), are indeed 
consistent with the subdivision of life into three domains, with 
archaeal and eukaryotic gene homologs being least distant
from each other. However, such single gene trees cannot be
rooted, and thus the closeness of archaea and eukaryotes may
simply mean that their genes mutate more slowly than do those 
of bacteria. 

New gene discoveries and recent critiques have cast some 
significant doubt on the validity of conclusions based on 
duplicated genes for EF and ATPase subunits. Recently, 
V-type-like ATPases (previously known to exist only in eukaryotes
and archaea), similar to archaeal V-type ATPases,
have been found in two species of bacteria (7, 8), and a
bacterial-like F1-ATPase β-subunit gene has been detected in
the archaeon Methanosarcina barkeri (9). Collectively, these
data suggest that either the full family structure of
ATPase-duplicated genes has yet to be determined (10) or that
extensive lateral gene transfers between domains have occurred
(11), thus rendering any conclusions about domain
relationships based on ATPase gene phylogenies suspect.
Forterre et al. (10) have expressed concerns over the small
number of amino acid positions that can be confidently aligned
between the EF-Tu/1α and EF-G/2 genes and the paucity of
taxa used by Iwabe et al. (3). Recent analyses involving a
broader species data base, in particular new archaeal EF genes,
produce statistically reliable trees using EF-G/2 but not
EF-Tu/1α sequences (12). Therefore, the rooting of the
universal tree remains an important question that must be
addressed not only through a reanalysis of existing EF and
ATPase data but also by using other ancient duplicated gene
families.

One such promising duplicated gene family comprises the 
aminoacyl-tRNA synthetases, which catalyze the esterification
or "charging" of a single amino acid to its cognate tRNA
molecule. The function and structure of aminoacyl-tRNA
synthetases have been intensely studied, especially with respect
to mechanisms of amino acid charging and tRNA specificity
(ref. 13; reviewed in ref. 14). On the basis of sequence
similarity and crystallographic structure, aminoacyl-tRNA 
synthetases are classified as being either group I (specific for 
glutamic acid, glutamine, tryptophan, tyrosine, valine, leucine, 
isoleucine, methionine, cysteine, and arginine) or group II 
(specific for threonine, proline, serine, lysine, aspartic acid, 
asparagine, histidine, alanine, glycine, and phenylalanine). 
Group I aminoacyl-tRNA synthetases all share two consensus
amino acid motifs, "HIGH" (His-Ile-Gly-His) and "KMSKS"
(Lys-Met-Ser-Lys-Ser), while group II synthetases lack these
motifs but have a third consensus region "GLER"
(Gly-Leu-Glu-Arg). Despite having similar catalytic function, groups I
and II aminoacyl-tRNA synthetases do not appear to be
related in sequence or higher order structure. 

Nagel and R. Doolittle (15) showed that all aminoacyl- 
tRNA synthetases within a specific group (I or II) are related 
and that bacterial and eukaryotic versions of aminoacyl-tRNA 
Fig. 1. Schematic diagram of PCR amplification and cloning strategy for IleRS genes. The length of IleRS gene products is known to vary from
939 to 1081 amino acids (40). "HIGH" and "KMSKS" are amino acid motifs conserved in all group I aminoacyl-tRNA synthetase gene products.
Initially, an approximately 950-bp region of IleRS, corresponding to fragment A, was amplified from all species using the degenerate oligonucleotide
primers il1 and il2. Fragment A was subsequently cloned and sequenced in its entirety. Later, a second round of amplifications was done with a
species-specific primer designed to anneal within fragment A and a degenerate primer designed to the KMSKSLGN motif, which generated
fragment B. The exception was N. locustae where the primer ileukS replaced the KMSKS primer to amplify the fragment B'. The fragments A and
B or B' overlapped, producing approximately 1600 bp of continuous sequence for all species (1884 bp for N. locustae).



synthetases charging a particular amino acid always cluster 
together to the exclusion of synthetases recognizing other 
amino acids. Their separate phylogenetic trees for group I and 
II aminoacyl-tRNA synthetases suggest that the amino acid- 
specific synthetases are ancient proteins that diverged prior to 
the emergence of prokaryotic and eukaryotic lineages. Thus, 
it is reasonable to attempt to root a universal tree derived from 
one amino acid type of group I aminoacyl-tRNA synthetase 
with the sequences of another group I aminoacyl-tRNA syn- 
thetase. 

In the present study, nearly the entire region between the
HIGH and KMSKS motifs (about 1600 bp in length) was
cloned and sequenced from the group I isoleucyl-tRNA synthetase
(IleRS) gene of several lower eukaryotes, bacteria, and
archaea. This portion of the gene represents the most conserved
region, both within and between different types of
group I aminoacyl-tRNA synthetases. The aminoacyl-tRNA
synthetases for three aliphatic amino acids (valine, leucine, and
isoleucine) were chosen because (i) these synthetases appear
(15) to be the most recently diverged (which facilitates their
alignment) and (ii) prior to this study, IleRS was the only
aminoacyl-tRNA synthetase characterized from an archaeon,
the methanogen Methanobacterium thermoautotrophicum (16).
In this report, additional archaeal IleRS sequences were
determined from the species Pyrococcus furiosus (like M.
thermoautotrophicum, a member of the Euryarchaeota) and
Sulfolobus acidocaldarius, a member of the Crenarchaeota.
IleRS sequences were also determined for species which,
according to rRNA phylogenies, are among the most deeply
branching lineages of Bacteria [Aquifex pyrophilus (17) and
Thermotoga maritima (18)] and eukaryotes (Nosema locustae,
an amitochondrial microsporidian). As well, a new ValRS from
the early-diverging eukaryote Trichomonas vaginalis was
sequenced.†

By rooting the IleRS gene tree with ValRS and LeuRS 
genes, our analysis provides significant, independent corroboration
of the earlier conclusions of Iwabe et al. about the close
relationship between Archaea and eukaryotes and the bacterial
root of the universal tree. Furthermore, the three domains
are shown to be separate monophyletic groups, a finding that 
is incompatible with the eocyte hypothesis of eukaryotic 
origins (19). 

†Sequences reported in this paper have been deposited in the
GenBank data base (accession nos. L37096-L37098 and L37104-L37106).



MATERIALS AND METHODS 

DNA Sources. Genomic DNA samples were gifts from the
following individuals: A. pyrophilus from R. Huber (Universität
Regensburg, Germany), P. furiosus from F. Robb (University
of Maryland, Baltimore), Th. maritima from P. Dennis
(University of British Columbia, Vancouver, Canada), Tr.
vaginalis from M. Müller (Rockefeller Institute, New York),
and N. locustae from A. Roger (this laboratory) prepared from
spores obtained from ATCC (no. 30860). S. acidocaldarius
genomic DNA was prepared from laboratory cultured cells (a
gift from W. Zillig, Max-Planck-Institut für Biochemie,
Martinsried, Germany). Other DNA sequences were obtained
from public data bases.

PCR Amplification, Cloning, and Sequencing. An approximately
1600-bp region of the IleRS genes from A. pyrophilus,
P. furiosus, S. acidocaldarius, and Th. maritima and a 1900-bp
region from the N. locustae gene were PCR-amplified with two
sets of oligonucleotide primers (Fig. 1). The first set of primers
was designed with partial degeneracy to the amino acid sequences
Trp-Asp-Thr-His-Gly-Leu-Pro-Ile-Glu (WDTHGLPIE in
single-letter code in Fig. 1)
(5'-TGGGAYACNCAYGGNYTNCCNRTNGA-3', named il1) and
Cys-Trp-Arg-(His or Cys or Ser)-(Lys or Asp)-Thr-Pro (CWRHKTP in
single-letter code in Fig. 1)
(complement 5'-GGNGTNTYRCWNCKCCARCA-3', named il2).
This primer pair consistently amplified a
fragment about 950 bp long in the tested species. In a second
PCR experiment, the remaining portion of the gene was
amplified by using a species-specific 5'-end primer (primer
sequences available upon request from J.R.B.) designed to
anneal within the il1/il2-cloned fragment and a complementary
degenerate primer designated KMSKS designed to the
amino acid motif Lys-Met-Ser-Lys-Ser-Leu-Gly-Asn (KMSKSLGN
in single-letter code)
(5'-RTTWCCHARWSAYTTWSACATYTT-3'). For N. locustae, the KMSKS primer was
replaced by the primer ileukS, complementary to the amino
acid sequence Asn-Trp-Tyr-Ile-Arg-Phe-Asn (NWYIRFN in
single-letter code in Fig. 1)
(5'-RTTNARNCKDATRTACCARTT-3') located about 300 bp downstream of the
KMSKSLGN motif. A 1480-bp region of the ValRS gene comparable
to the IleRS 1600-bp section was amplified from Tr. vaginalis
by using a ValRS-specific primer val1, which matches the 5'
end amino acid sequence Asp-His-Ala-Gly-Ile-Ala-Thr-Gln
(DHAGIATQ in single-letter code)
(5'-GAYCAYGCWGGWATWGCWACNCA-3') and the KMSKS primer.
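
The relationship between a degenerate primer and the peptide it targets can be checked mechanically. The sketch below is an illustration only, assuming the standard genetic code and IUPAC ambiguity codes; the function names are hypothetical. It tests, codon by codon, whether a degenerate sense-strand primer permits at least one codon for each residue, and it reverse-complements antisense primers before testing.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and their complements.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT",
    "S": "CG", "W": "AT", "K": "GT", "M": "AC", "B": "CGT",
    "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R",
    "S": "S", "W": "W", "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}

# Standard genetic code, codons enumerated in TCAG order.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}


def reverse_complement(primer):
    """Reverse-complement a primer that may contain ambiguity codes."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


def primer_can_encode(primer, peptide):
    """True if every (possibly partial) degenerate codon allows the residue."""
    for i, residue in enumerate(peptide):
        codon = primer[3 * i:3 * i + 3]
        if not codon:
            return False  # primer is too short to cover this residue
        allowed = [IUPAC[base] for base in codon]
        if not any(
            all(real[k] in allowed[k] for k in range(len(codon)))
            for real, aa in CODON_TABLE.items() if aa == residue
        ):
            return False
    return True
```

Applied to the sequences quoted above, `primer_can_encode("TGGGAYACNCAYGGNYTNCCNRTNGA", "WDTHGLPIE")` holds, and the antisense KMSKS primer passes once reverse-complemented: `primer_can_encode(reverse_complement("RTTWCCHARWSAYTTWSACATYTT"), "KMSKSLGN")`.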

Thermal cycle amplifications were performed in 50-µl final
volume with 5 µl of 10× reaction buffer (500 mM KCl/100 mM
Tris-HCl, pH 8.3/15 mM MgCl2/0.1% gelatin) containing
dNTPs at 200 mM, primers at 5-7 µM, sample DNA (≈50 ng),
and 0.5 units of Thermus aquaticus DNA polymerase with 50
µl of mineral oil overlaid. The reaction cycles consisted of
denaturation for 1 min at 95°C, primer annealing for 1 min at
48°C, and extension for 2 min at 72°C. Cycles were repeated 40
times, and the final cycle included an extension reaction of 5 
min. Negative controls (all of the above reagents except for 
template DNA) were included in all amplification series as a 
screen for possible foreign DNA contamination. 

Amplified DNA samples were electrophoresed in 2.0% 
low-melting-point agarose gels in separate gel apparatuses, 
and the fragments were extracted either by the phenol method 
(20) or with the Prep-a-Gene kit according to the vendor's
protocols (Bio-Rad). Isolated DNA fragments were then 
subcloned into the pCRII vector by following the vendor's 
methods (Invitrogen). Double-stranded DNA was sequenced 
by using the dideoxynucleotide chain-termination method (21) 
and T7 polymerase (United States Biochemical) and following
standard protocols. One DNA strand was sequenced in its 
entirety, and, depending on the species, about 40-80% of the 
complementary strand was also determined by using internal 
oligonucleotide primers. All ambiguous regions were con- 
firmed by sequencing the opposite strand. 

Sequence Alignments. New sequences were edited by using 
the program esee (22). IleRS, LeuRS, and ValRS sequences 
were obtained from the National Center for Biotechnology Information
data base by using Network entrez software. Amino
acid sequences were first aligned with the program multalin 
(23) and then edited by eye to better align certain conserved 
motifs. The final alignment was in good agreement with those 
done previously (15, 24). Since the placement of some gaps is 
variable, all insertions/deletions were edited from multiple 
sequence alignments, leaving 354 amino acid positions for the 
phylogenetic analysis. 
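
The step of editing all insertions/deletions out of the multiple alignment amounts to dropping every column that contains a gap in any sequence. A minimal sketch, assuming gaps are written as "-" and all sequences are already aligned to the same length (the function name is hypothetical):

```python
def strip_gapped_columns(aligned, gap="-"):
    """Remove every alignment column in which any sequence has a gap."""
    if len({len(seq) for seq in aligned}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*aligned) iterates over columns; keep only the gap-free ones.
    kept = [col for col in zip(*aligned) if gap not in col]
    # Transpose the surviving columns back into one string per sequence.
    return ["".join(col[i] for col in kept) for i in range(len(aligned))]
```

For instance, `strip_gapped_columns(["AC-DE", "A-CDE", "ACCDE"])` keeps only the three gap-free columns and returns `["ADE", "ADE", "ADE"]`.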

Phylogenetic Analyses. Phylogenetic trees were constructed
by using both maximum parsimony and distance methods. 
Maximum parsimony analysis was done with the software 
packages paup version 3.1.1 (25) and phylip version 3.5 (26). 
The large size of this data set did not permit an exhaustive 
search for the total number of minimal-length trees. Instead, 
the program paup was used to estimate the number and length 
of minimal trees from 20 replicate random heuristic searches 
with the protpars stepmatrix to specify the minimum number
of nucleotide replacements required to change from one
amino acid to another. The programs seqboot, protpars,
and consense of the phylip 3.5 package were used to derive
confidence limits, estimated by 300 bootstrap replicates, for
branch points in the maximum parsimony tree. 
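
The bootstrap idea behind these confidence limits is to resample alignment columns with replacement and count how often a grouping recurs. The sketch below is a toy stand-in, not the seqboot/protpars/consense pipeline: instead of building a parsimony tree for each replicate, it merely asks how often a given pair of sequences is the least-different pair. All names are hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def mismatches(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_pair(alignment):
    """Names (sorted) of the two sequences with the fewest mismatches."""
    names = sorted(alignment)
    scored = [
        (mismatches(alignment[p], alignment[q]), p, q)
        for i, p in enumerate(names) for q in names[i + 1:]
    ]
    return min(scored)[1:]


def pair_support(alignment, pair, replicates=300, seed=1):
    """Fraction of bootstrap replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    hits = sum(
        closest_pair(bootstrap_columns(alignment, rng)) == target
        for _ in range(replicates)
    )
    return hits / replicates
```

With a gap-free alignment held as a dict of name to sequence, `pair_support(alignment, ("taxonA", "taxonB"))` returns a value between 0 and 1 that plays the role of the percentages attached to nodes in a bootstrap consensus tree.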

A distance matrix of pairwise comparisons of the proportion
of different amino acids per site was constructed by using the
program protdist (26). In our analysis, we invoked the
"Dayhoff" program option, which estimates the expected
amino acid replacements per position by using a replacement
model based on the Dayhoff 120 matrix. The programs seqboot,
neighbor, and consense were used to derive a neighbor-joining
tree with confidence limits estimated by 300 bootstrap
replications.
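
The uncorrected quantity underlying such a matrix, the proportion of differing amino acids per site, is simple to compute. The sketch below shows only that raw proportion for gap-free aligned sequences; it does not implement the Dayhoff correction applied by protdist, and the function names are hypothetical.

```python
def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal-length, non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(alignment):
    """Symmetric dict-of-dicts of pairwise p-distances, keyed by name."""
    names = sorted(alignment)
    return {
        p: {q: p_distance(alignment[p], alignment[q]) for q in names}
        for p in names
    }
```

One minus the p-distance is the fractional sequence identity, so the same function also gives pairwise identity values for an alignment.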

RESULTS 

Sequence Analysis. The five new IleRS and one ValRS 
sequence shared many similarities with known aminoacyl- 
tRNA synthetases of their respective type. Conserved se- 
quence motifs previously noted in IleRS genes were found in 
all five new IleRS sequences. However, some species had 
unique insertions in different regions of the molecule, ranging 
in length from as few as 1 to as many as 33 amino acids. None 
of these insertions were concordant synapomorphies among 
species. All gaps in the alignment were omitted, which left 354 
amino acid positions for the phylogenetic analysis. 



For IleRS sequences, mean intradomain sequence identity 
values were 57% for archaea, 54% for bacteria, and 61% for 
eukaryotes. As expected, mean sequence identity comparisons 
between archaea and eukaryotes (41%), archaea and bacteria 
(45%), or eukaryotes and bacteria (35%) were lower (se- 
quence alignment and pairwise distance comparisons are 
available upon request from J.R.B.). 

Phylogenetic Analyses. Only one minimal length tree was
found after 20 random replicates of maximum parsimony 
analysis using the protpars stepmatrix in the program paup. 
This tree was 3512 steps long and showed archaea and 
eukaryotes as sister groups. This grouping occurred in 75% of 
300 bootstrap replicates of maximum parsimony analysis using 
the program protpars (Fig. 2). A similar tree topology was 
obtained with a simple progressive scalar scheme for down- 
weighting increasingly variable sites in maximum parsimony 
analysis (implemented in paup). Bootstrap analysis of this 
weighted parsimony method showed 70% support for the 
Archaea-Eucarya clade. Unweighted parsimony also recovered
the Archaea-Eucarya clade in the minimal length tree,
although bootstrap analysis resulted in low statistical support 
(about 60%) for an Archaea-Bacteria clade. 

The distance method, using expected amino acid replace- 
ments per site (calculated by the program protdist) to 
construct a neighbor-joining tree, also supported the Archaea- 
Eucarya clade at a high bootstrap value of 85% (Fig. 3). All 
phylogenetic analyses consistently supported, with bootstrap 
confidence limits ranging from 88% to 100%, the separate 
monophyletic groups of archaea, bacteria, and eukaryotes. 

Within domains, the branching order of individual species 
was less well resolved. Both phylogenetic methods separated
the archaeal groups, Euryarchaeota and Crenarchaeota, al- 
though with low bootstrap confidence limits. Although the 
internal nodes are not statistically significant, the maximum 
parsimony tree appears to agree best with the expectations of 
branching order proposed by rRNA phylogenies within
eukaryotes (N. locustae being the lowest branch) and bacteria

Fig. 2. Consensus maximum parsimony tree of IleRS, ValRS, and
LeuRS genes using the program protpars (26). Numbers are the
frequency of occurrence of nodes in 300 bootstrap replicates.

Fig. 3. Neighbor-joining tree of IleRS, ValRS, and LeuRS genes
using the program neighbor (26). The scale represents 0.5 expected
number of amino acid replacements per position as determined with
the program protdist. Numbers are the frequency of occurrence of
nodes that exceeded 50% of 300 bootstrap replicates.

(Th. maritima and A. pyrophilus nearest the root). Better
resolution of taxa within the IleRS phylogeny can likely be 
obtained by using the full-length sequences rather than only 
those positions that can be confidently aligned with ValRS and 
LeuRS (J.R.B., unpublished data). 

The products of the ValRS genes of yeast, Neurospora, and
humans are utilized in both the cytoplasm and mitochondria,
so the placement of E. coli ValRS at the root of the eukaryotes
suggests that nuclear copies of these ValRS genes may have
originated from a mitochondrial endosymbiont. Thus, the firm
placement of the amitochondrial protist Tr. vaginalis with the
rest of the eukaryote mitochondrial isoforms with high bootstrap
confidence limits is surprising.

DISCUSSION 

Maximum parsimony and neighbor-joining distance trees both
show that (i) the three sets of aminoacyl-tRNA synthetase
genes form monophyletic groups, in agreement with the
analysis of Nagel and Doolittle (15) of the entire group I family
of genes; (ii) within the IleRS portion of the tree, Archaea,
Bacteria, and Eucarya are separate monophyletic domains;
and (iii) Archaea and eukaryotes are supported as a clade
according to heuristic search methods for the minimal-length
tree as well as bootstrap analysis, which, under most conditions,
is considered to be a conservative estimate of the
significance of branching points (27).

The IleRS tree provides important confirmation of the 
rooting of the universal tree in the lineage leading to the 
bacteria as suggested by the analysis of Iwabe et al. (3) of the
duplicated genes encoding EF-Tu/1α and EF-G/2. The EF
gene analysis involved the reciprocal rooting of two gene trees, 
both of which included representative species from all three 
domains. In our study, only the IleRS gene tree has a full 
complement of archaeal, bacterial, and eukaryotic species,
since ValRS and LeuRS genes are unknown for the archaea.
However, we consider our result to be the strongest to date in 
support of the sisterhood of archaea and eukaryotes. The 
present IleRS data set exceeds that of the EF-Tu/1α gene
family in terms of sequence length (354 amino acids for IleRS
versus 120 amino acids for the joint EF alignment). Furthermore,
the IleRS dataset includes more deeply branching
species within the eukaryotes and bacteria and a more com- 
prehensive selection of archaea. 

The analysis of duplicated genes performed by Iwabe et al.
(3) involved only single archaeal homologs and thus did not
address the issue of the coherence of the Archaea. Monophyletic
groupings of archaeal, bacterial, and eukaryotic clades are
strongly supported by the present phylogenetic analysis. In 
confirming a root between bacteria and archaea/eukaryotes, 
the IleRS data set also supports inferences concerning the 
monophyly of each domain based on unrootable data (for 
instance the rRNA sequences) and is inconsistent with
treatments of this data that would place the root between the 
Euryarchaeotes and Crenarchaeotes [as in the Lake 1988 
version of the "eocyte tree" (19)]. 

The congruence of IleRS and EF gene trees is not surprising, 
given that aminoacyl-tRNA synthetases and EF-Tu/1α
sequentially interact with the tRNA-amino acid complex and, as
such, might have coevolved functions. The greater similarity of
the archaea to eukaryotes rather than to bacteria is supported
by several lines of evidence involving the cell's genetic
machinery. These include recent findings of archaeal homologs to
eukaryotic TATA-binding proteins (28, 29), transcription fac- 
tor TFIIB (30), and a TFIIS-like sequence (31) as well as the 
closer sequence similarity of genes for RNA polymerase (5) 
and many ribosomal proteins (6). 

Other data sets sometimes suggest alternative relationships 
between the three domains. For example, glutamine syn- 
thetase trees place archaea and the Gram-positive bacteria in 
the same clade: some sort of lateral transfer might be the best 
explanation (32). Similarly, some eukaryotic nuclear genes, in
addition to those likely derived from mitochondrial or plastid
genomes, appear to be of bacterial rather than archaeal origin (an
example is phosphoglycerate kinase). Other authors (33, 34)
have claimed that such occurrences bespeak a radical chimer- 
ism, the eukaryotic nucleus for instance being the product of 
the fusion of the entire genomes of archaea and bacteria. 
Although the present data do not address these issues directly, 
they add to a considerable body of evidence in favor of the 
notion that the eukaryotic transcription and translation
machinery, surely the core of cell biology, is archaeal in
nature. Whether other nuclear genes of apparent bacterial
origin were acquired in some genetic cataclysm or one-by-one 
over hundreds of millions of years remains an open question. 

While eukaryotes must have a full suite of aminoacyl-tRNA 
synthetases that are functional in both the cytoplasm and the 
mitochondria, the mode of coding for specific cellular isoforms 
varies with amino acid type. For example, there are two 
separate LeuRS genes coding for cytoplasmic and mitochon- 
drial isoforms (35), while the same ValRS gene product is used 
in both cellular locations (36-38). Our analysis suggests that 
eukaryotic cytoplasmic IleRS are the product of ancient 
nuclear genes, while the single eukaryotic ValRS may have 
been of bacterial (endosymbiotic mitochondrial) origin. Thus, 
the placement of the amitochondrial protist, Tr. vaginalis, with
the remaining eukaryotes is a surprising result and suggests 
that the nuclear genome of Tr. vaginalis may have experienced 
a similar introduction of certain genes from an endosymbiont. 
While trichomonads lack a mitochondrion, they do have 
another organelle, the hydrogenosome, which may have had an 
endosymbiotic origin (reviewed in ref. 39). However, any 
conclusions about the relationships among ValRS genes must 
remain highly speculative, given the limited number of genes 
known from lower eukaryotes. Furthermore, archaeal sequences
would be essential for determining the exact topology
of the ValRS portion of the tree.

Our analysis opens several new avenues of research. Group 
I, as well as group II, aminoacyl-tRNA synthetases are large
multigene families that conceivably offer several other oppor- 
tunities for testing the root of the universal tree. Although 
group I and II aminoacyl-tRNA synthetases have the same 
catalytic function, the two gene families are highly divergent at 
the amino acid sequence level and appear to employ different 
modes of tRNA recognition (14, 40). Given the evolutionary 
distinctiveness of the two groups of aminoacyl-tRNA synthe- 
tases, it has been postulated that, at one time, there may have 
been two independent protein synthetic systems working with 
reduced sets of amino acids that subsequently merged into the 
present-day genetic code (15). The degree of congruence 
between group I and II aminoacyl-tRNA synthetase gene trees 
using new archaeal sequences might provide further insight
into the level of refinement of the genetic machinery of the last
common ancestral cell.

We have sequenced the valyl-tRNA synthetase gene (valS) of Bacillus subtilis and found an open reading
frame coding for a protein of 880 amino acids with a molar mass of 101,749. The predicted amino acid sequence
shares strong similarity with the valyl-tRNA synthetases from Bacillus stearothermophilus, Lactobacillus casei,
and Escherichia coli. Extracts of B. subtilis strains overexpressing the valS gene on a plasmid have increased
valyl-tRNA aminoacylation activity. Northern analysis shows that valS is cotranscribed with the folC gene
(encoding folyl-polyglutamate synthetase) lying downstream. The 300-bp 5' noncoding region of the gene
contains the characteristic regulatory elements, T box, "specifier codon" (GUC), and rho-independent
transcription terminator of a gene family in gram-positive bacteria that encodes many aminoacyl-tRNA synthetases
and some amino acid biosynthetic enzymes and that is regulated by tRNA-mediated antitermination. We have 
shown that valS expression is induced by valine limitation and that the specificity of induction can be switched 
to threonine by changing the GUC (Val) specifier triplet to ACC (Thr). Overexpression of valS from a 
recombinant plasmid leads to autorepression of a valS-lacZ transcriptional fusion. Like induction by valine 
starvation, autoregulation of valS depends on the presence of the GUC specifier codon. Disruption of the valS 
gene was not lethal, suggesting the existence of a second gene, as is the case for both the thrS and the tyrS genes. 



The aminoacyl-tRNA synthetases (aaRS) catalyze the cova- 
lent attachment of amino acids to their cognate tRNAs, a 
reaction crucial for the accuracy of protein synthesis. For the 
most part, there is only one aaRS for each amino acid species 
in bacteria, although several exceptions are known. The pres- 
ence of two very similar lysyl-tRNA synthetases represents the 
singular exception in Escherichia coli (21, 22, 26), where the 
tRNA synthetases for all 20 amino acids have been cloned (12). 
The situation is different in gram-positive organisms. On the 
one hand, they lack a glutaminyl-tRNA synthetase (43), and on 
the other hand, there are two distinct threonyl-tRNA syn- 
thetase genes (thrS and thrZ [32]) and two tyrosyl-tRNA syn- 
thetase genes (tyrS and tyrZ [9, 20]) in Bacillus subtilis and two 
histidyl-tRNA synthetase genes in Lactococcus lactis (36). 
Chances are that other duplicate genes will be identified with 
further progress in the various genome-sequencing projects. 
We have previously shown that the normally silent thrZ gene is 
induced during threonine starvation or by reducing the intra- 
cellular concentration of the housekeeping synthetase, ThrS 
(33). 

In contrast to E. coli, in which the mechanisms for aaRS 
gene regulation are as disparate as the number of genes stud- 
ied (for a review, see references 12 and 34), most of the B. 
subtilis genes isolated appear to be regulated by a common 
mechanism. Of the 15 tRNA synthetase genes cloned and 
sequenced in B. subtilis (for a review, see references 4 and 34), 
all but the asparaginyl (asnS [2])-, glutamyl (gltX [7])-, lysyl 
(lysS [31])-, and methionyl (metS [31])-tRNA synthetase genes 
share common sequence and structural motifs in the leader 
regions upstream of the translation initiation site (14). Their 
leader regions are about 300 bp long, and each contains a 
transcriptional terminator immediately preceded by a 14-nu- 
cleotide consensus sequence known as the T box (19, 20, 33). 



* Corresponding author. Mailing address: UPR 9073, CNRS, Insti- 
tut de Biologie Physico-Chimique, 13 rue Pierre et Marie Curie, 75005 
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putzer@ibpc.fr. 



This configuration is found not just in the aaRS genes but also 
in several of the amino acid biosynthetic operons in Bacillus 
spp. and other gram-positive organisms (13, 34). The leader 
region of the thrZ gene extends over 800 bases and comprises 
three such tandem domains (33). 

For several genes of this family, it has been shown that they 
are specifically induced by starvation for their cognate amino 
acid via a mechanism involving transcriptional antitermination. 
This is the case in the tyrS (14), pheS (35), and thrS and thrZ 
genes (33) and the ilv-leu operon (11). thrS and thrZ are also 
autorepressed by overproduction of the synthetases themselves 
(8, 33). 

Base pairing between part of the conserved T-box sequence 
and an equally conserved sequence in the 5' half of the termi- 
nator stem can lead to the formation of an alternative, and 
mutually exclusive, structure called the antiterminator (14, 33). 
Studying the tyrS system, Grundy and colleagues first provided 
evidence that the uncharged tRNA can stabilize the formation 
of the antiterminator structure by interacting with two sites in 
the leader mRNA (14, 15). The first is between the anticodon 
loop of the uncharged tRNA and a "specifier codon" that is 
likely to be bulged out of a large stem-loop structure found in 
the 5' half of the leader RNAs of this gene family. The second 
proposed interaction occurs through base pairing between 
the NCCA-3' acceptor end of the uncharged tRNA (includ- 
ing the discriminator base) and the perfectly complementary 
-UGGN'- sequence in the T box, which is bulged out in the 
antiterminator conformation (14, 15). Several mutational stud- 
ies have now been carried out in different systems, and they 
basically confirm the importance of the specifier codon for the 
specificity of induction during cognate amino acid starvation 
(14, 28, 35). Changing the identity of the specifier codon has, in 
many cases, permitted a switch in the identity of the regulatory 
amino acid. 
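
The pairing rules described above can be made concrete with a short sketch. The Python below is an illustration only and is not part of the original study: it assumes strict Watson-Crick pairing (wobble is ignored), and both function names are hypothetical. It derives the anticodon that would read a given specifier codon and checks whether a T-box core (5'-UGGN-3') is complementary to a tRNA acceptor end (5'-NCCA-3').

```python
# Watson-Crick partners for RNA bases (assumption: wobble pairing is ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the perfectly complementary sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


def tbox_pairs_with_acceptor(tbox_core: str, acceptor_end: str) -> bool:
    """True if the T-box core (5'-UGGN-3') is the reverse complement of
    the tRNA acceptor end (5'-NCCA-3', N being the discriminator base)."""
    return anticodon_for(acceptor_end) == tbox_core.upper()


if __name__ == "__main__":
    for codon in ("GUC", "ACC"):
        print(codon, "is read by anticodon", anticodon_for(codon))
    print(tbox_pairs_with_acceptor("UGGU", "ACCA"))
    print(tbox_pairs_with_acceptor("UGGA", "UCCA"))
```

Under these assumptions a GUC specifier pairs with a GAC anticodon and ACC with GGU, while a UGGU T-box core matches an acceptor end with discriminator A and UGGA matches discriminator U.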

The role of the discriminator base in stabilizing the interac- 
tion between the acceptor end of the uncharged tRNA and the 
T box has been studied for tyrS (15), pheS (35), and thrS (35). 
These reports show that while this interaction is important, the 
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1 CCATGGCGGCACGAATAAGAGCTGAAACGTTTCTTCACGGTGCATCCTCCTCTACGCATTTACAAGCATCATATCTAAACTTGGGCAATA 90 

91 GGTGCCTGCCCATCTGTATAGAAAGCGGTGTTTT'I^GAAAAGATGATTCACGAATCAAAAACACTTTTACT^ 180 

-35 -10 

181 ATGTATACAATGAGCTGATAAATACACCTGCTTAGAACGGGAAGAGTACAAAAAAGGCGCATGCAGAGAGAGAAGCCGTTTGCTGAAAGG 270 

271 CTTCTTATGTTGTCTGATTGGAAG^T^GCCCGGGAGCATAATTTCTTGAAAGAAGAGTAGAGAAATTCGGGAAACACCCGTTATCCGTTA 360 

361 TAAGTGCATGAACTGATTGAGTTCAT dAAAAAAGGTGGTACCGCcjAAAGAGCTTTTCGTCCTT T^^ 450 

451 TTCTCAGCCATTTTTAAACATGCTGGAGGGTTATGATCGAAACGAATGAACAAACAATGCCGACGAAATATGATCCGGCAGCGGTT^3AAA 540 

SD METNEQTMPTKYDPAAVE 

FIG. 1. Nucleotide sequence of the 5' noncoding region of the B. subtilis valS gene. The consensus promoter sequences (-35 and -10 regions) are underlined. The 
bent arrow indicates the +1 transcription start point. The deduced Shine-Dalgarno-type sequence (SD) is underlined. Converging arrows indicate a potential 
Rho-independent transcription terminator. The specifier codon (GTC) and the T-box consensus sequence are boxed. The sequence of the whole gene has been 
deposited in the GenBank/EMBL databases. 



sometimes ambiguous results obtained with different mutants 
suggest that other points of interaction between the tRNA and 
mRNA, and possibly protein factors, are involved in regula- 
tion. 

An additional level of regulatory complexity was recently 
introduced with the discovery that the leader mRNA of thrS 
and at least five other members of this gene family is cleaved 
just upstream of the transcription terminator in vivo (5). The 
processed thrS transcript is significantly more stable than the 
full-length mRNA and is the predominant form under threo- 
nine starvation conditions. Even though processing can occur 
in the absence of the tRNA-leader interaction, its contribution 
to overall induction levels following threonine starvation is 
substantial (5). 

One of the reasons we have studied the expression of the 
valS gene is to find out whether the different aspects of regu- 
lation described above apply to other genes of this family or if, 
on the contrary, some of the regulatory mechanisms are con- 
fined to specific genes. For example, all the genes cited above 
are induced by tRNA-mediated antitermination, but autoreg- 
ulation has thus far been described only for thrS and thrZ 
expression. The only other gene tested in this respect, pheS, 
although induced by phenylalanine starvation, was not re- 
pressed by overproduction of phenylalanyl-tRNA synthetase. 

In this report, we describe the identification, sequencing and 
characterization of the valS gene. We analyze its transcription 
pattern and the importance of the specifier codon and the T 
box for the specificity of induction by valine starvation. Fur- 
thermore, we provide evidence that valS, like thrS/thrZ but 
unlike pheS, is autoregulated. 

MATERIALS AND METHODS 

Bacterial strains and culture and transformation conditions. All B. subtilis 
strains used in this study are derivatives of the prototrophic strain 168 (BGSC 
1A2) or the auxotrophic strain BGSC 1A232 (ilvD4 trpDI), containing valS-lacZ 
fusions integrated into the amy locus. Strains were grown in M9 minimal medium 
(29) supplemented with 0.5 mM Trp, 3 mM Ile, 3 mM Leu, 3 mM Val, and trace 
elements (17). For valine starvation experiments, cells were grown as just de- 
scribed but in the presence of only 0.6 mM Val and harvested for β-galactosidase 
measurements 2 h after the end of logarithmic growth. Threonine starvation was 
achieved by the addition of 600 μg of DL-threonine hydroxamate per ml to an M9 
medium culture at an optical density at 600 nm of 0.3 to 0.4 (prototrophic strain), 
which still allowed logarithmic growth. Cells were harvested 2 h later. 

Plasmid manipulations were performed in E. coli JM109 [recA1 endA1 gyrA86 
thi hsdR17 supE44 relA λ− Δ(lac-proAB), F′(traD36 proAB lacIq lacZΔM15)]. 
E. coli KE89 (F− endA1 hsdR1 hsdM+ supE44 thi-1 pcnB) served as a host for 
overexpression studies with the valS-containing plasmid pHMV11, since this 
high-copy-number plasmid could not be stably maintained in a pcnB+ strain. 



Concatemeric plasmids for transformation of B. subtilis were isolated from E. coli 
JM101 [thi supE44 Δ(lac-proAB) F′(traD36 proAB+ lacIq lacZΔM15)]. 

E. coli cells were transformed by electroporation (37), and B. subtilis cells were 
transformed as described elsewhere (25). E. coli transformants were selected on 
LB plates supplemented with 100 μg of ampicillin per ml, and B. subtilis trans- 
formants were selected on LB plates with 4 μg of chloramphenicol (integrative 
plasmids) or 10 μg of tetracycline (replicative plasmids) per ml. 

Plasmid constructions. Plasmid pDG1129 was a generous gift from P. Stragier. 
It was constructed by insertion of a 3.15-kb NcoI-XbaI chromosomal DNA 
fragment containing the valS gene into the vector pMTL22 (3), which was cut 
with the same enzymes. 

For pHMV4, a 1-kb BglII-HindIII fragment (coordinates 1 to 978 of the valS 
sequence) from pDG1129 was inserted into plasmid pTZ18R (USB) cut with 
BamHI and HindIII. 

For pHMV8, the 1-kb insert of pHMV4 was excised as an EcoRI-HindIII 
fragment and cloned into plasmid pHM2 (8) cut with EcoRI and HindIII. 

For pHMV11, the 3.15-kb insert of pDG1129 containing the entire valS gene 
was excised as an NsiI-XbaI fragment and inserted into the shuttle vector pHM3 
(33) cut with PstI and XbaI. 

For pHMV12, an internal 1.5-kb HindIII fragment of valS was inserted into 
the integrative vector pDG641 (16) cut with HindIII. 

For pHMV13, the 1-kb insert of pHMV4 was mutated at two sites: the GUC 
triplet (coordinates 295 to 297 in Fig. 1) was altered to ACC, and the -TGGT- 
sequence of the T box (coordinates 396 to 399 in Fig. 1) was changed to -TGGA-. 
The mutated fragment was excised with EcoRI and HindIII and inserted into 
pHM2 cut with EcoRI and HindIII. 

For pHMV14, the 1-kb insert of pHMV4 where the GUC specifier codon has 
been mutated to UAA was excised with EcoRI and HindIII and inserted into 
pHM2 cut with EcoRI and HindIII. 

For pHMV15, the 1-kb insert of pHMV4 where the GUC specifier codon has 
been mutated to ACC was inserted as an EcoRI-HindIII fragment into pHM2 
cut with EcoRI and HindIII. 

DNA manipulations. The 3.15-kb insert of plasmid pDG1129, containing the 
entire valS gene, was subcloned as three separate fragments in the multicopy 
plasmids pTZ18R and pTZ19R (USB) for sequencing. The double-stranded 
recombinant DNAs were used as templates in dideoxy chain termination sequenc- 
ing reactions (38), using the universal and reverse primers as well as specific 
synthetic oligonucleotides for the central regions of the cloned fragments. 

Site-directed mutagenesis was performed on a single-stranded DNA template 
by the method of Kunkel et al. (24). Mutations in the valS leader were generally 
introduced on plasmid pHMV4 before being transferred to the lacZ fusion 
vector pHM2. Oligonucleotides used for mutagenesis extended 12 to 15 nucle- 
otides on either side of the mutation site, and sequences are available on request. 

RNA manipulations. Total cellular RNA was isolated as described previously 
(33). Reverse transcriptase assays were carried out with 15 μg of total RNA and 
about 1 pmol of 5'-end-labeled oligonucleotide (sequence complementary to 
positions 479 to 498 in Fig. 1). The RNA and oligonucleotide were heated 
together at 65°C for 5 min and then frozen in a mixture of dry ice and ethanol and 
allowed to thaw on ice. Reactions contained 2 U of avian myeloblastosis virus 
reverse transcriptase (Eurogentec) and were allowed to run for 30 min at 48°C. 

Northern analysis of total cellular RNA was performed as described elsewhere 
(33). A radiolabeled 1.5-kb HindIII fragment of the valS structural gene (see Fig. 
5) was used as a valS-specific probe. The folC probe was amplified by PCR from 
chromosomal DNA (positions 73 to 1096 of the structural gene; see Fig. 5). 

β-Galactosidase and aminoacylation assays. The β-galactosidase activity of 
lacZ fusions was measured as described previously (33). 
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TABLE 1. ValS activity in total cell extracts of valS-overexpressing 
E. coli and B. subtilis cells^a 
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FIG. 2. Putative secondary structures of the specifier domain, the antitermi- 
nator, and the terminator of the B. subtilis valS leader. The GUC triplet and the 
-UGGU- sequence in the antiterminator that are believed to interact with the 3' 
end of the Val-tRNA are in boldface type. Other conserved sequences (14) 
are marked by asterisks. 



For in vitro aminoacylation measurements, B. subtilis or E. coli cells harboring 
recombinant plasmids containing valS or the vector alone were grown in LB 
broth to an optical density at 600 nm of ~1. Cells were harvested and washed 
with Z buffer without β-mercaptoethanol (29). After suspension in 500 μl of 
buffer A (10 mM Tris-HCl [pH 7.4], 10% glycerol, 1 mM dithiothreitol), samples 
were sonicated and clarified by centrifugation. The 100-μl aminoacylation reac- 
tion was carried out at 37°C in a reaction mixture containing 50 mM Tris-HCl 
(pH 7.5), 15 mM MgCl2, 15 mM β-mercaptoethanol, 10 mM ATP, 50 μM 
L-[14C]valine at 200 cpm/pmol, 1 mM dithiothreitol, 120 μg of total E. coli tRNA, 
and various amounts of cellular extract. The nucleic acids were precipitated by 
trichloroacetic acid and filtered out on GFC filters (Whatman), and the radio- 
activity retained on the filters was measured by scintillation counting. 
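
The conversion from filter counts to the activity units used in Table 1 follows directly from the stated specific radioactivity of the valine (200 cpm/pmol). The sketch below is illustrative only: the function name, the background-subtraction step, and the example numbers are assumptions, not values or procedures reported in the paper.

```python
def specific_activity(cpm_on_filter: float,
                      background_cpm: float,
                      protein_ug: float,
                      cpm_per_pmol: float = 200.0) -> float:
    """Convert filter-retained radioactivity to pmol of charged tRNA per
    microgram of total protein.

    cpm_per_pmol defaults to the 200 cpm/pmol given for the labeled valine.
    Background subtraction is an assumed step, not one stated in the text.
    """
    if protein_ug <= 0 or cpm_per_pmol <= 0:
        raise ValueError("protein amount and specific radioactivity must be positive")
    net_cpm = max(cpm_on_filter - background_cpm, 0.0)
    pmol_charged = net_cpm / cpm_per_pmol
    return pmol_charged / protein_ug


if __name__ == "__main__":
    # Hypothetical readings chosen only to show the arithmetic.
    print(specific_activity(cpm_on_filter=2100, background_cpm=100, protein_ug=2.0))
```

With the made-up inputs shown (2,100 cpm, 100 cpm background, 2 μg of protein), the net 2,000 cpm corresponds to 10 pmol of charged tRNA, or 5 pmol/μg of protein.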

Computer analysis. Sequence comparisons were done with the help of the 
programs BestFit and PileUp of the University of Wisconsin Genetics Computer 
Group. 

Nucleotide sequence accession number. The nucleotide sequence of the valS 
gene has been deposited in the GenBank/EMBL databases under the accession 
number X77239. 



RESULTS 

Identification of the valS gene. The putative B. subtilis valS 
gene encoding valyl-tRNA synthetase (ValS) had previously 
been identified by a homology search of a sequence upstream 
of the folC gene (encoding folyl-polyglutamate synthetase) that 
comprises the C-terminal 56 amino acids of a truncated open 
reading frame (27). Plasmid pDG1129 was constructed by P. 
Stragier (unpublished data) and carries a 3.15-kb fragment 
containing the chromosomal region immediately upstream of 
folC, including the sequence described above. We sequenced 
this 3.15-kb fragment and found it to contain the entire valS 
transcriptional unit. A 540-bp segment of the 5' end of this 
sequence contains the valS leader region and is shown in Fig. 
1. An open reading frame encoding a protein of 880 amino 
acids was identified between positions 486 and 3125 (data not 
shown). The deduced protein sequence shows strong similarity 
to the valyl-tRNA synthetases of Bacillus stearothermophilus 
(1), Lactobacillus casei (40), and E. coli (18). Sequence align- 
ment of the four known prokaryotic synthetases shows that, as 
expected, the B. subtilis ValS is more closely related to its 
homologs from the gram-positive organisms B. stearother- 



Bacterial strain          ValS activity (pmol of charged tRNA^Val/μg      Overexpression
                          of total protein) in:                           (fold)
                          Vector (pHM3)       Vector + valS^b

E. coli KE89                   1.1                 6.5                        6
B. subtilis SSB184^c          11                  27                          2.5

^a All data are average values from three independent experiments. 
^b pDG1129 in E. coli; pHMV11 in B. subtilis. 
^c SSB184 is the B. subtilis wild-type strain 1A2 (BGSC) containing the valS- 
lacZ fusion HMV8. 



mophilus and Lactobacillus casei (89% similarity and 80% 
identity, and 75% similarity and 61% identity, respectively) 
than to the E. coli enzyme (67% similarity and 46% identity). 
The N-terminal two-thirds of the protein is well conserved 
between all four organisms. This part of the protein contains 
the catalytic core (23, 41), including the signature sequences 
HIGH and KMSKS of the Rossmann nucleotide binding fold 
in class I aaRS (6). However, the E. coli synthetase contains 
some quite extensive insertions in this part of the protein that 
are found in none of the other three synthetases, possibly 
reflecting species-related differences between gram-positive 
and gram-negative organisms. The L. casei enzyme has a 19- 
amino-acid N-terminal extension compared to the two Bacillus 
synthetases. The C-terminal third of the four ValS proteins is 
more divergent and emphasizes the close evolutionary distance 
between the two Bacillus species. The supposition initially ad- 
vanced by Wetzel (42), that valyl-, isoleucyl-, leucyl-, and me- 
thionyl-tRNA synthetase are all members of a subfamily within 
the aaRS, is supported by aligning the sequences of these 
proteins from different origins (see Discussion). 
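
As a quick consistency check on the reading frame reported above, the coordinates and the protein length can be compared arithmetically. This is a sketch added for illustration; the variable names are hypothetical, and the only inputs are the figures quoted in the text (positions 486 to 3125, 880 amino acids, molar mass 101,749).

```python
# Figures quoted in the text.
ORF_START = 486        # first nucleotide of the open reading frame
ORF_END = 3125         # last nucleotide of the open reading frame
PROTEIN_LENGTH = 880   # amino acids
MOLAR_MASS = 101_749   # predicted molar mass of the protein

orf_length_nt = ORF_END - ORF_START + 1      # inclusive coordinates
codons, remainder = divmod(orf_length_nt, 3)
mean_residue_mass = MOLAR_MASS / PROTEIN_LENGTH

if __name__ == "__main__":
    print(f"ORF length: {orf_length_nt} nt = {codons} codons (remainder {remainder})")
    print(f"Mean residue mass: {mean_residue_mass:.1f}")
```

The interval spans 2,640 nucleotides, which is exactly 880 codons, so the quoted coordinates appear to cover the coding triplets without the stop codon (the text does not say so explicitly). The mean residue mass of roughly 115.6 is in the range expected for an average protein.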

We identified a potential σA-type promoter (Fig. 1) with a 
spacing of 17 bp and a near-consensus sequence (TTGACA 
and TATAAT for B. subtilis σA- and E. coli σ70-type promot- 
ers) ~300 nucleotides upstream of the start codon. Its func- 
tionality has been confirmed, as described below. The roughly 
300-bp leader contains all of the regulatory elements necessary 
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FIG. 3. Primer extension analysis of B. subtilis valS mRNA. Total RNA of a 
B. subtilis wild-type strain was reverse transcribed with a primer complementary 
to nucleotides 479 to 498 in Fig. 1. The same oligonucleotide was used for the se- 
quencing reaction with plasmid pDG1129 as a template. RT, reverse transcription. 
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FIG. 4. Northern blot analysis of valS transcripts. (A) A radiolabeled 1.5-kb 
HindIII fragment of the valS structural gene (Fig. 5) was used to probe total 
RNA extracted from a B. subtilis wild-type strain. (B) The same blot was stripped 
of the valS probe and rehybridized with a folC PCR probe (positions 73 to 1096 
of the structural gene; Fig. 5). The sizes of the two mRNA species were estimated 
using the BRL 0.24- to 9.5-kb RNA molecular weight marker. 



to assign it to the family of genes regulated by tRNA-mediated 
antitermination (Fig. 1 and 2): a Rho-independent transcrip- 
tion terminator which is preceded by the T-box consensus 
sequence upstream of the structural gene and a highly struc- 
tured specifier domain in the 5' half of the leader which con- 
tains the potential GUC specifier codon. We have analyzed the 
importance of these elements for valS regulation (see below). 

The valS gene product can charge tRNA^Val in vitro. In order 
to prove the identity of the sequenced gene, we overexpressed 
it in both E. coli and B. subtilis and measured an increase in 
tRNA^Val aminoacylation activity in cell extracts in vitro. For 
overexpression in E. coli, we transformed plasmid pDG1129 
into the pcnB strain KE89, since this high-copy-number recom- 
binant plasmid could not be stably maintained in a pcnB+ 
strain. The valS gene was transferred to B. subtilis by transfor- 
mation with plasmid pHMV11, constructed by inserting the 
3.15-kb insert of pDG1129 into the shuttle vector pHM3. The 
aminoacylation activities found in the various cell extracts are 
given in Table 1. The 6- and 2.5-fold increases in activity in E. 
coli and B. subtilis cells, respectively, that were harboring the 
recombinant plasmids clearly show that the cloned gene en- 
codes a functional valyl-tRNA synthetase. The difference in 
increase in absolute ValS activity between E. coli and B. subtilis 
harboring plasmids pDG1129 and pHMV11, respectively, could 
reflect a lower plasmid copy number in E. coli (pcnB) than in 
B. subtilis or a lower expression of the heterologous B. subtilis 
valS gene in E. coli. The 2.5-fold increase in ValS activity ob- 
served in B. subtilis also serves as a reference value for the 
autoregulation studies described below. 

Mapping of valS transcripts. The transcription start site of 
valS was determined by primer extension analysis with an oli- 
gonucleotide complementary to nucleotides 479 to 498 in Fig. 
1. Reverse transcription reactions identified a single band cor- 
responding to a transcription start point at position 174 (Fig. 
3), which is consistent with the proposed promoter. 
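
The coordinates given for the primer, the start point, and the open reading frame allow two simple length checks. The sketch below is an added illustration with hypothetical variable names; it uses only positions quoted in the text and the coordinate system of Fig. 1.

```python
# Coordinates quoted in the text (numbering as in Fig. 1).
TSS = 174            # mapped transcription start point (+1)
ORF_START = 486      # first nucleotide of the valS open reading frame
PRIMER_3P_TARGET = 479   # primer is complementary to positions 479..498
PRIMER_5P_TARGET = 498

# Untranslated leader: from +1 up to the base preceding the start codon.
leader_length_nt = ORF_START - TSS

# Full-length primer extension product: from the primer's 5' end
# (position 498 on the mRNA) back to the transcription start point.
extension_product_nt = PRIMER_5P_TARGET - TSS + 1

if __name__ == "__main__":
    print(f"Leader length: {leader_length_nt} nt")
    print(f"Expected primer extension product: {extension_product_nt} nt")
```

The leader comes out at 312 nucleotides, in line with the "roughly 300-bp" description, and a cDNA extended to position 174 would be 325 nucleotides long including the 20-nucleotide primer.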

Northern analysis of valS transcripts during exponential 
growth, using a 1.5-kb valS internal HindIII fragment as a 
probe (see Fig. 5), revealed two major transcripts of 3 and 4.4 
kb (Fig. 4A). Some larger RNAs appear to be carried along in 
front of the 23S rRNA to give a weak additional signal. A 
probe specific for the folC gene located immediately down- 
stream of valS (Fig. 5) also hybridizes to the 4.4-kb transcript 
but does not hybridize to the 3-kb mRNA (Fig. 4B). Thus, we 
believe that the 3-kb mRNA species corresponds to the valS 
mRNA and results from transcription termination at the Rho- 
independent terminator located in the short intergenic region 
between valS and folC (Fig. 5) (30, 39) and that the 4.4-kb 
transcript is a polycistronic mRNA comprising both the valS 
and the folC genes. 

A GUC triplet confers the specificity of valS induction. The 
expression of the wild-type valS gene and that of various leader 
mutants was studied with the help of lacZ transcriptional fu- 
sions integrated in single copy at the amy locus of a wild-type 
strain or a strain auxotrophic for valine. The wild-type valS- 
lacZ fusion (HMV8) was induced almost threefold by starva- 
tion for valine. It is noteworthy that efficient valine starvation 
could be achieved only by adding excess leucine to the me- 
dium, despite the fact that the strain used (1A232) is not a 
leucine auxotroph (see Discussion). To test the relevance of 
the GUC triplet (Fig. 1 and 2) to valS induction during valine 
starvation, we measured β-galactosidase activity in fusions 
where the amino acid identity of this triplet had been changed. 
HMV15 has the GUC specifier codon replaced by an ACC 
triplet, the threonine codon which confers specificity of thrS 
induction. HMV13 contains a mutation in the T box (TGGT → 
TGGA) in addition to the GUC → ACC mutation to retain base 
pairing with the discriminator base (U) of the Thr-tRNA 
isoacceptor. HMV14 has a TAA stop codon in place of the 
wild-type GUC triplet. The results are summarized in Table 2. 
Changing the GUC (Val) to an ACC (Thr) triplet causes loss 
of induction by valine starvation and renders expression induc- 
ible by threonine starvation (3.4-fold). At the same time, the 
basal level of expression decreases more than 10-fold. Adap- 
tation of the T-box sequence to better accommodate the in- 
teraction of the valS leader with the Thr-tRNA isoaccep- 
tor restores the basal level of expression to wild-type levels but, 
paradoxically, causes a near loss of induction by threonine 
starvation (Table 2). Replacing the GUC specifier codon with 
TAA (stop codon) renders the valS gene uninducible. 
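
The induction factors discussed here are simple ratios of the β-galactosidase activities listed in Table 2 (starved culture over complete medium). The sketch below recomputes them as an illustration; the dictionary layout and function name are hypothetical, and the activities are the rounded values printed in the table, so recomputed ratios can differ from the printed factors in the last digit.

```python
# β-Galactosidase activities (U/mg) as printed in Table 2.
# None marks a measurement reported as "not done".
ACTIVITY = {
    "HMV8":  {"complete": 22.0, "val_starved": 56.0, "thr_starved": 21.0},
    "HMV15": {"complete": 1.6,  "val_starved": 0.6,  "thr_starved": 5.4},
    "HMV13": {"complete": 15.0, "val_starved": 5.0,  "thr_starved": 18.0},
    "HMV14": {"complete": 0.9,  "val_starved": 0.5,  "thr_starved": None},
}


def induction(fusion: str, condition: str):
    """Return starved/complete activity ratio, or None if not measured."""
    row = ACTIVITY[fusion]
    starved = row[condition]
    if starved is None:
        return None
    return starved / row["complete"]


if __name__ == "__main__":
    for fusion in ACTIVITY:
        val = induction(fusion, "val_starved")
        thr = induction(fusion, "thr_starved")
        thr_text = "ND" if thr is None else f"{thr:.2f}x"
        print(f"{fusion}: valine {val:.2f}x, threonine {thr_text}")
    basal_drop = ACTIVITY["HMV8"]["complete"] / ACTIVITY["HMV15"]["complete"]
    print(f"Basal drop, GUC -> ACC: {basal_drop:.1f}-fold")
```

The recomputed ratios agree with the printed induction factors to within rounding, and the basal activity ratio between HMV8 and HMV15 (22 versus 1.6 U/mg) is consistent with the "more than 10-fold" decrease stated in the text.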

valS expression is autoregulated. We previously showed that 
expression of thrS and thrZ, but not pheS, is autoregulated in a 
specifier-codon-dependent manner. In order to analyze wheth- 
er autorepression is confined to the thrS/thrZ system or repre- 
sents a more widespread phenomenon, we introduced the 
recombinant ValS overproducing plasmid, pHMV11, into a 



TABLE 2. Effect of specifier codon and T-box mutations on induction of valS-lacZ expression 

                                              β-Galactosidase activity (U/mg)^b
valS-lacZ    Specifier    T-box       Complete   Valine                 Threonine
fusion^a     codon        sequence    medium     starvation  Induction  starvation  Induction

HMV8 (wt)    GUC (Val)    -UGGU-      22         56          2.6×       21          0.9×
HMV15        ACC (Thr)    -UGGU-       1.6        0.6        0.4×        5.4        3.4×
HMV13        ACC (Thr)    -UGGA-      15          5          0.3×       18          1.2×
HMV14        UAA (stop)   -UGGU-       0.9        0.5        0.6×       ND          ND

^a These fusions were measured in a strain auxotrophic for valine, strain 1A96 (see Materials and Methods). wt, wild type. 
^b ND, not done. All data represent average values from at least three independent experiments. 
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FIG. 5. Chromosomal neighborhood of the B. subtilis valS gene (39). The 
valS promoter and potential Rho-independent transcription terminators are 
indicated. Wavy arrows symbolize the mRNA species observed by Northern 
analysis (Fig. 4). The lines labeled V and F above the valS and folC genes indicate 
the sizes and positions of the fragments used as probes in Northern analysis (Fig. 
4). H, HindIII. 



strain carrying the wild-type valS-lacZ fusion (HMV8). As 
shown in Table 3, a 2.5-fold increase in valS activity was suf- 
ficient to repress the activity of the valS-lacZ fusion over 5-fold. 
Thus, expression of valS appears to be extremely sensitive to 
variations in the intracellular concentration of the synthetase. 
While a direct role for the synthetase in valS regulation cannot 
be ruled out at present, it appears more likely that autoregu- 
lation occurs by altering the ratio of charged to uncharged 
valyl-tRNA. Due to the extremely low levels of β-galactosidase 
expression in the stop codon mutant fusion (HMV14, Table 2) 
and the GUC → ACC mutant fusion (HMV15, Table 2), we 
could not test them for autoregulation. Therefore, we analyzed 
the importance of the specifier codon for autorepression in the 
double mutant HMV13 fusion (specifier codon and T box 
adapted to match the Thr-tRNA isoacceptor; Table 3), 
which has a higher basal level of expression (Table 2). The 
double mutation led to a loss of autoregulation (1.7-fold re- 
pression), underlining the importance of these two sites of 
tRNA-mRNA interaction for this type of regulation. Although 
no repression was observed with a 10-fold overproduction of 
ThrS (Table 3), this is perhaps not surprising given that the 
HMV13 fusion is also not inducible by threonine starvation 
(see above). 

DISCUSSION 

The three valyl-tRNA synthetases from the gram-positive 
organisms B. stearothermophilus (1), Lactobacillus casei (40), 
and B. subtilis are very similar and more compact than their E. 
coli counterpart (18), which contains some extensive insertions 
in the amino-terminal two-thirds of the protein. Comparison of 
the B. subtilis ValS sequence with other branched-chain aaRS 
proteins in bacteria revealed surprisingly strong similarities 
between B. subtilis ValS and the following synthetases (ex- 
pressed in percentage per unit length, similarity and identity): 
B. subtilis MetS, 49.2 and 26.2%; E. coli MetS, 45.6 and 21.4%; 
E. coli IleS, 49.5 and 26.3%; B. subtilis LeuS, 50.9 and 25.2%; 
and E. coli LeuS, 54.3 and 29.8%. It is interesting that simi- 
larities between heterologous synthetases are not necessarily 
higher when they originate from the same organism (B. subtilis 
ValS is 50.9% similar to B. subtilis LeuS but 54.3% similar to 



E. coli LeuS), implying that the common ancestor of the 
branched-chain aaRS probably existed before the separation of 
bacteria in a gram-positive and gram-negative kingdom. 

Northern blot analysis revealed the presence of two tran- 
scripts (3 and 4.4 kb) containing the valS gene. The 4.4-kb 
transcript also hybridized to a folC-specific probe, indicating 
that both genes are likely to be cotranscribed on a polycistronic 
mRNA originating at the valS promoter. This is also the pre- 
dicted length of a transcript extending from the valS promoter 
to the transcription terminator immediately downstream of 
folC. The presence of roughly equal amounts of the two mR- 
NAs indicates that the valS terminator is only about 50% 
efficient. Overexposure of the Northern blot revealed only very 
low quantities of transcripts extending beyond the folC tran- 
scription terminator. This is consistent with the finding that 
expression of comC, the gene lying downstream of folC (Fig. 
5), is induced only during late competence (30). 
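
The estimate of terminator efficiency made above can be written as a simple fraction: transcripts that stop at the terminator divided by all transcripts that reach it. The sketch below is an added illustration, not the authors' quantification; it assumes that band intensity is proportional to transcript abundance and that both mRNAs are equally stable and equally well detected by the probe.

```python
def termination_efficiency(terminated: float, readthrough: float) -> float:
    """Fraction of transcripts that stop at a terminator.

    terminated:  signal of the transcript ending at the terminator
                 (here the 3-kb valS mRNA)
    readthrough: signal of the transcript extending past it
                 (here the 4.4-kb valS-folC mRNA)
    Both are in arbitrary but identical units.
    """
    total = terminated + readthrough
    if total <= 0:
        raise ValueError("at least one transcript signal must be positive")
    return terminated / total


if __name__ == "__main__":
    # Roughly equal amounts of the two mRNAs, as described in the text.
    print(termination_efficiency(terminated=1.0, readthrough=1.0))
```

With equal signals for the two species the function returns 0.5, matching the "about 50% efficient" estimate.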

We attempted to inactivate the valS gene on the chromo- 
some and found this not to be lethal. Disruption of valS in the 
survivors was confirmed by Southern blotting (data not shown) 
and suggests that a second functional gene with valine-tRNA 
synthetase activity exists in B. subtilis, as is the case for the 
threonyl- and tyrosyl-tRNA synthetases. 

Sequence and two-dimensional structure analyses of the valS 
leader suggested that this gene is a member of the family of 
genes in gram-positive organisms that comprises aaRS and 
amino acid biosynthetic genes regulated by tRNA-mediated anti- 
termination (14). Expression of valS was induced by starvation 
for valine, but this derepression could be observed only when 
the cells were grown in the presence of excess leucine, despite 
the fact that the trpC2 ilvD4 mutant strain used in this study is 
auxotrophic for tryptophan, isoleucine, and valine but not for 
leucine. A rationale for this observation may be found in the 
way the ilv-leu biosynthetic operon is regulated. Expression of 
the ilv-leu operon is also likely to be regulated by the level of 
charged/uncharged Leu-tRNA via tRNA-mediated antitermi- 
nation (28) and responds to variations in leucine concentration 
(10). Since the uncharacterized ilvD4 mutation used here 
shows a slightly leaky phenotype, we believe that excess leucine 
further shuts down ilv-leu expression, thereby creating condi- 
tions whereby valine starvation can occur more efficiently. 

The specificity of valS induction depends on the identity of 
the strategically placed GUC (Val) triplet (specifier codon) in 
the extensive 5'-terminal secondary structure (Fig. 2). Chang- 
ing the GUC triplet to ACC (Thr) switched the specificity of 
induction from valine to threonine starvation. However, the 
GUC → ACC transition leads to a more than 10-fold drop in 
the basal level of expression, very close to the activity of an 
uninducible fusion in which the GUC specifier was mutated to 
a UAA stop codon. Clearly, the Thr-tRNA isoacceptor 
interacts much less efficiently with the valS leader containing 
the ACC codon than Val-tRNA interacts with the wild-type 
leader. In order to improve this interaction, we altered the 
T-box sequence to match the discriminator base of the Thr- 



TABLE 3. Effect of valS and thrS overexpression on wild-type and mutant valS-lacZ fusions 

valS-lacZ     Specifier   T-box      Multicopy           β-Galactosidase    Repression      Overexpression of
fusion        codon       sequence   plasmid    Insert   sp act (U/mg)^a    factor (fold)   synthetase (fold)

HMV8 (wt)^b   GUC         -UGGU-     pHM3       Control  39
                                     pHMV11     valS      7.5               5.2             2.5
HMV13         ACC         -UGGA-     pHM3       Control  44
                                     pHMV11     valS     26                 1.7             2.5

^a All β-galactosidase activities are average values from at least three independent experiments. 
^b wt, wild type. 
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tRNA isoacceptor. Indeed, basal expression rose about 
10-fold and approached wild-type levels but at the same time 
became almost uninducible by threonine starvation (Table 2). 
This phenomenon is difficult to explain. The GUC → ACC 
transition (HMV15) was sufficient to render the mutated valS- 
lacZ fusion inducible by starvation for threonine, indicating 
that the Thr-tRNA isoacceptor can, albeit not very effi- 
ciently, interact with the valS leader and recognize the ACC 
specifier codon. Permitting the T-box sequence to base pair 
with the discriminator base of the Thr-tRNA (HMV13) 
seemed to improve the tRNA-mRNA interaction, as reflected 
by a 10-fold increase in basal expression. However, if this 
interaction is indeed so efficient, one would expect this mutant 
fusion to be highly inducible by threonine starvation, which is 
clearly not the case. We previously encountered a similar phe- 
nomenon when introducing analogous mutations in the thrS 
leader to complement the discriminator base of Phe-tRNA 
(35). 

It seems logical that genes whose expression responds to the 
ratio of charged to uncharged cognate tRNA would also be 
affected by the intracellular concentration of the product 
responsible for charging these tRNAs, but of the two systems 
studied to date (thrS/thrZ [8] and pheS [35]), only the thrS and 
thrZ genes were autoregulated. To see whether this type of 
regulation represents a more common phenomenon was one of 
the reasons we tested whether valS expression is autoregulated. 
As shown in this study, a 2.5-fold overexpression of valS from 
a multicopy plasmid led to a more than 5-fold repression of a 
wild-type valS-lacZ fusion. Expression of the valS gene is thus 
more tightly controlled by the intracellular concentration of its 
product than is the case for thrS, where 10-fold overproduction 
leads to 10-fold repression (8). This could be explained if basal 
valS expression were to depend much more heavily on antiter- 
mination mediated by uncharged tRNA than is the case for 
thrS. In minimal medium, expression of a valS-lacZ fusion, in 
which the GUC specifier codon has been changed to a TAA 
stop codon, drops 20-fold compared to expression of the wild- 
type fusion (Table 2), while an equivalent change in the thrS 
system leads to only a 2-fold drop in expression (35). While it 
seems likely that autorepression occurs by altering the charged/ 
uncharged tRNA ratio in the cell, an additional direct inter- 
action of the synthetase with the mRNA cannot yet be ex- 
cluded. 
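
The comparison between valS and thrS made in this paragraph can be put on a common scale by asking how steeply repression rises with synthetase overproduction. The sketch below uses a log-log exponent for this purpose; that index is an assumption introduced here for illustration and is not a measure used by the authors. The inputs are the fold values quoted in the text (2.5-fold overproduction giving 5.2-fold repression for valS; 10-fold giving 10-fold for thrS).

```python
import math


def repression_exponent(overexpression_fold: float, repression_fold: float) -> float:
    """Exponent n in repression = overexpression ** n.

    n = 1 means repression is proportional to overproduction;
    n > 1 means expression responds more than proportionally.
    """
    if overexpression_fold <= 1 or repression_fold <= 0:
        raise ValueError("overexpression must exceed 1-fold and repression must be positive")
    return math.log(repression_fold) / math.log(overexpression_fold)


VALS_EXPONENT = repression_exponent(2.5, 5.2)    # values from Table 3
THRS_EXPONENT = repression_exponent(10.0, 10.0)  # values cited from reference 8

if __name__ == "__main__":
    print(f"valS: n = {VALS_EXPONENT:.2f}")
    print(f"thrS: n = {THRS_EXPONENT:.2f}")
```

On this crude scale valS gives an exponent of about 1.8 against 1.0 for thrS, which is one way of expressing the statement that valS is more tightly controlled by the concentration of its own product.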
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